Preface



In recent years, the discovery of new algorithms for dealing with polyno- 
mial equations, coupled with their implementation on inexpensive yet fast 
computers, has sparked a minor revolution in the study and practice of 
algebraic geometry. These algorithmic methods and techniques have also 
given rise to some exciting new applications of algebraic geometry. 

One of the goals of Using Algebraic Geometry is to illustrate the many 
uses of algebraic geometry and to highlight the more recent applications 
of Gröbner bases and resultants. In order to do this, we also provide an
introduction to some algebraic objects and techniques more advanced than 
one typically encounters in a first course, but which are nonetheless of 
great utility. Finally, we wanted to write a book which would be accessible 
to nonspecialists and to readers with a diverse range of backgrounds. 

To keep the book reasonably short, we often have to refer to basic re- 
sults in algebraic geometry without proof, although complete references are 
given. For readers learning algebraic geometry and Gröbner bases for the
first time, we would recommend that they read this book in conjunction 
with one of the following introductions to these subjects: 

• Introduction to Gröbner Bases, by Adams and Loustaunau [AL]

• Gröbner Bases, by Becker and Weispfenning [BW]

• Ideals, Varieties and Algorithms, by Cox, Little and O’Shea [CLO] 

We have tried, on the other hand, to keep the exposition self-contained 
outside of references to these introductory texts. We have made no effort 
at completeness, and have not hesitated to point the reader to the
research literature for more information. 

Later in the preface we will give a brief summary of what our book covers. 



The Level of the Text 

This book is written at the graduate level and hence assumes the reader 
knows the material covered in standard undergraduate courses, including 
abstract algebra. But because the text is intended for beginning graduate
students, it does not require graduate algebra, and in particular, the book
does not assume that the reader is familiar with modules. Being a graduate
text, Using Algebraic Geometry covers more sophisticated topics and has
a denser exposition than most undergraduate texts, including our previous 
book [CLO]. 

However, it is possible to use this book at the undergraduate level, pro- 
vided proper precautions are taken. With the exception of the first two 
chapters, we found that most undergraduates needed help reading prelimi- 
nary versions of the text. That said, if one supplements the other chapters 
with simpler exercises and fuller explanations, many of the applications we
cover make good topics for an upper-level undergraduate applied algebra 
course. Similarly, the book could also be used for reading courses or senior 
theses at this level. We hope that our book will encourage instructors to 
find creative ways for involving advanced undergraduates in this wonderful 
mathematics. 



How to Use the Text 

The book covers a variety of topics, which can be grouped roughly as 
follows: 

• Chapters 1 and 2: Gröbner bases, including basic definitions, algorithms
and theorems, together with solving equations, eigenvalue methods, and
solutions over $\mathbb{R}$.

• Chapters 3 and 7: Resultants, including multipolynomial and sparse 
resultants as well as their relation to polytopes, mixed volumes, toric
varieties, and solving equations. 

• Chapters 4, 5 and 6: Commutative algebra, including local rings, stan- 
dard bases, modules, syzygies, free resolutions, Hilbert functions and 
geometric applications. 

• Chapters 8 and 9: Applications, including integer programming, combi- 
natorics, polynomial splines, and algebraic coding theory. 

One unusual feature of the book’s organization is the early introduction 
of resultants in Chapter 3. This is because there are many applications 
where resultant methods are much more efficient than Gröbner basis methods.
While Gröbner basis methods have had a greater theoretical impact on
algebraic geometry, resultants appear to have an advantage when it comes 
to practical applications. There is also some lovely mathematics connected 
with resultants. 

There is a large degree of independence among most chapters of the book. 
This implies that there are many ways the book can be used in teaching a 
course. Since there is more material than can be covered in one semester, 
some choices are necessary. Here are three examples of how to structure a 
course using our text.

• Solving Equations. This course would focus on the use of Gröbner bases
and resultants to solve systems of polynomial equations. Chapters 1, 2, 
3 and 7 would form the heart of the course. Special emphasis would be 
placed on §5 of Chapter 2, §5 and §6 of Chapter 3, and §6 of Chapter 7. 
Optional topics would include §1 and §2 of Chapter 4, which discuss 
multiplicities. 

• Commutative Algebra. Here, the focus would be on topics from classical 
commutative algebra. The course would follow Chapters 1, 2, 4, 5 and 6, 
skipping only those parts of §2 of Chapter 4 which deal with resultants. 
The final section of Chapter 6 is a nice ending point for the course. 

• Applications. A course concentrating on applications would cover integer 
programming, combinatorics, splines and coding theory. After a quick 
trip through Chapters 1 and 2, the main focus would be Chapters 8 and 
9. Chapter 8 uses some ideas about polytopes from §1 of Chapter 7, 
and modules appear naturally in Chapters 8 and 9. Hence the first two 
sections of Chapter 5 would need to be covered. Also, Chapters 8 and 
9 use Hilbert functions, which can be found in either Chapter 6 of this 
book or Chapter 9 of [CLO]. 

We want to emphasize that these are only three of many ways of using the 
text. We would be very interested in hearing from instructors who have 
found other paths through the book. 

References 

References to the bibliography at the end of the book are by the first three 
letters of the author’s last name (e.g., [Hil] for Hilbert), with numbers for 
multiple papers by the same author (e.g., [Mac1] for the first paper by
Macaulay). When there is more than one author, the first letters of the 
authors’ last names are used (e.g., [BE] for Buchsbaum and Eisenbud), 
and when several sets of authors have the same initials, other letters are 
used to distinguish them (e.g., [BoF] is by Bonnesen and Fenchel, while 
[BuF] is by Burden and Faires). 

The bibliography lists books alphabetically by the full author’s name, 
followed (if applicable) by any coauthors. This means, for instance, that 
[BS] by Billera and Sturmfels is listed before [Bla] by Blahut. 



Comments and Corrections 



We encourage comments, criticism, and corrections. Please send them to 
any of us: 



David Cox     dac@cs.amherst.edu
John Little   little@math.holycross.edu
Don O’Shea    doshea@mhc.mtholyoke.edu

For each new typo or error, we will pay $1 to the first person who reports
it to us. We also encourage readers to check out the web site for Using
Algebraic Geometry, which is at

http://www.cs.amherst.edu/~dac/uag.html

This site includes updates and errata sheets, as well as links to other sites 
of interest.

Chapter 1

Introduction 



Algebraic geometry is the study of geometric objects defined by polynomial 
equations, using algebraic means. Its roots go back to Descartes’ introduc- 
tion of coordinates to describe points in Euclidean space and his idea of 
describing curves and surfaces by algebraic equations. Over the long his- 
tory of the subject, both powerful general theories and detailed knowledge 
of many specific examples have been developed. Recently, with the devel- 
opment of computer algebra systems and the discovery (or rediscovery) of 
algorithmic approaches to many of the basic computations, the techniques 
of algebraic geometry have also found significant applications, for example 
in geometric design, combinatorics, integer programming, coding theory, 
and robotics. Our goal in Using Algebraic Geometry is to survey these 
algorithmic approaches and many of their applications. 

For the convenience of the reader, in this introductory chapter we will 
first recall the basic algebraic structure of ideals in polynomial rings. In §2 
and §3 we will present a rapid summary of the Gröbner basis algorithms
developed by Buchberger for computations in polynomial rings, with several
worked-out examples. Finally, in §4 we will recall the geometric notion of
an affine algebraic variety, the simplest type of geometric object defined by 
polynomial equations. The topics in §1, §2, and §3 are the common prereq- 
uisites for all of the following chapters. §4 gives the geometric context for 
the algebra from the earlier sections. We will make use of this language at 
many points. If these topics are familiar, you may wish to proceed directly 
to the later material and refer back to this introduction as needed. 



§1 Polynomials and Ideals 

To begin, we will recall some terminology. A monomial in a collection of
variables $x_1, \ldots, x_n$ is a product

(1.1)    $x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$

where the $\alpha_i$ are non-negative integers. To abbreviate, we will
sometimes rewrite (1.1) as $x^\alpha$, where
$\alpha = (\alpha_1, \ldots, \alpha_n)$ is the vector of exponents in the
monomial. The total degree of a monomial is the sum of the exponents:
$\alpha_1 + \cdots + \alpha_n$. We will often denote the total degree of the
monomial $x^\alpha$ by $|\alpha|$. For instance $x_1^3 x_2^2 x_4$ is a
monomial of total degree 6 in the variables $x_1, x_2, x_3, x_4$, since
$\alpha = (3, 2, 0, 1)$ and $|\alpha| = 6$.

If $k$ is any field, we can form finite linear combinations of monomials
with coefficients in $k$. The resulting objects are known as polynomials in
$x_1, \ldots, x_n$. We will also use the word term on occasion to refer to a
product of a nonzero element of $k$ and a monomial appearing in a
polynomial. Thus, a general polynomial in the variables $x_1, \ldots, x_n$
with coefficients in $k$ has the form

    $f = \sum_\alpha c_\alpha x^\alpha,$

where $c_\alpha \in k$ for each $\alpha$, and there are only finitely many
terms $c_\alpha x^\alpha$ in the sum. For example, taking $k$ to be the field
$\mathbb{Q}$ of rational numbers, and denoting the variables by $x, y, z$
rather than using subscripts,

(1.2)    $p = x^2 + \tfrac{1}{2} y^2 z - z - 1$

is a polynomial containing four terms.

In most of our examples, the field of coefficients will be either
$\mathbb{Q}$, the field of real numbers, $\mathbb{R}$, or the field of
complex numbers, $\mathbb{C}$. Polynomials over finite fields will also be
introduced in Chapter 9. We will denote by $k[x_1, \ldots, x_n]$ the
collection of all polynomials in $x_1, \ldots, x_n$ with coefficients in $k$.
Polynomials in $k[x_1, \ldots, x_n]$ can be added and multiplied as usual, so
$k[x_1, \ldots, x_n]$ has the structure of a commutative ring (with
identity). However, only nonzero constant polynomials have multiplicative
inverses in $k[x_1, \ldots, x_n]$, so $k[x_1, \ldots, x_n]$ is not a field.
On the other hand, the set of rational functions
$\{f/g : f, g \in k[x_1, \ldots, x_n],\ g \neq 0\}$ is a field, denoted
$k(x_1, \ldots, x_n)$.

A polynomial $f$ is said to be homogeneous if all the monomials appearing
in it with nonzero coefficients have the same total degree. For instance,
$f = 4x^3 + 5xy^2 - z^3$ is a homogeneous polynomial of total degree 3 in
$\mathbb{Q}[x, y, z]$, while $g = 4x^3 + 5xy^2 - z^6$ is not homogeneous.
When we study resultants in Chapter 3, homogeneous polynomials will play an
important role.

Given a collection of polynomials, $f_1, \ldots, f_s \in k[x_1, \ldots, x_n]$,
we can consider all polynomials which can be built up from these by
multiplication by arbitrary polynomials and by taking sums.

(1.3) Definition. Let $f_1, \ldots, f_s \in k[x_1, \ldots, x_n]$. We let
$\langle f_1, \ldots, f_s \rangle$ denote the collection

    $\langle f_1, \ldots, f_s \rangle = \{p_1 f_1 + \cdots + p_s f_s : p_i \in k[x_1, \ldots, x_n] \text{ for } i = 1, \ldots, s\}.$

For example, consider the polynomial $p$ from (1.2) above and the two
polynomials

    $f_1 = x^2 + z^2 - 1$
    $f_2 = x^2 + y^2 + (z - 1)^2 - 4.$

We have

(1.4)    $p = x^2 + \tfrac{1}{2} y^2 z - z - 1$
           $= \left(-\tfrac{1}{2} z + 1\right)(x^2 + z^2 - 1) + \left(\tfrac{1}{2} z\right)(x^2 + y^2 + (z - 1)^2 - 4).$

This shows $p \in \langle f_1, f_2 \rangle$.
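As a check on (1.4), the two products on the right-hand side can be expanded by hand. The following is only a verification sketch of that identity, using the polynomials $f_1$ and $f_2$ given above:

```latex
\begin{align*}
\left(-\tfrac12 z + 1\right)\left(x^2 + z^2 - 1\right)
  &= -\tfrac12 x^2 z - \tfrac12 z^3 + \tfrac12 z + x^2 + z^2 - 1, \\
\tfrac12 z \left(x^2 + y^2 + (z-1)^2 - 4\right)
  &= \tfrac12 x^2 z + \tfrac12 y^2 z + \tfrac12 z^3 - z^2 - \tfrac32 z .
\end{align*}
```

Adding the two lines, the terms in $x^2 z$, $z^3$ and $z^2$ cancel, and $\tfrac12 z - \tfrac32 z = -z$, which leaves $x^2 + \tfrac12 y^2 z - z - 1 = p$.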



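Exercise 1.

a. Show that $x^2 \in \langle x - y^2, xy \rangle$ in $k[x, y]$ ($k$ any field).

b. Show that $\langle x - y^2, xy, y^2 \rangle = \langle x, y^2 \rangle$.

c. Is $\langle x - y^2, xy \rangle = \langle x^2, xy \rangle$? Why or why not?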

Exercise 2. Show that $\langle f_1, \ldots, f_s \rangle$ is closed under sums
in $k[x_1, \ldots, x_n]$. Also show that if
$f \in \langle f_1, \ldots, f_s \rangle$, and $p \in k[x_1, \ldots, x_n]$ is
an arbitrary polynomial, then $p \cdot f \in \langle f_1, \ldots, f_s \rangle$.

The two properties in Exercise 2 are the defining properties of ideals in
the ring $k[x_1, \ldots, x_n]$.

(1.5) Definition. Let $I \subset k[x_1, \ldots, x_n]$ be a non-empty subset.
$I$ is said to be an ideal if

a. $f + g \in I$ whenever $f \in I$ and $g \in I$, and

b. $pf \in I$ whenever $f \in I$, and $p \in k[x_1, \ldots, x_n]$ is an
arbitrary polynomial.

Thus $\langle f_1, \ldots, f_s \rangle$ is an ideal by Exercise 2. We will
call it the ideal generated by $f_1, \ldots, f_s$ because it has the
following property.

Exercise 3. Show that $\langle f_1, \ldots, f_s \rangle$ is the smallest
ideal in $k[x_1, \ldots, x_n]$ containing $f_1, \ldots, f_s$, in the sense
that if $J$ is any ideal containing $f_1, \ldots, f_s$, then
$\langle f_1, \ldots, f_s \rangle \subset J$.

Exercise 4. Using Exercise 3, formulate and prove a general criterion for
equality of ideals $I = \langle f_1, \ldots, f_s \rangle$ and
$J = \langle g_1, \ldots, g_t \rangle$ in $k[x_1, \ldots, x_n]$.
How does your statement relate to what you did in part b of Exercise 1?

Given an ideal, or several ideals, in $k[x_1, \ldots, x_n]$, there are a
number of algebraic constructions that yield other ideals. One of the most
important of these for geometry is the following.

(1.6) Definition. Let $I \subset k[x_1, \ldots, x_n]$ be an ideal. The
radical of $I$ is the set

    $\sqrt{I} = \{p \in k[x_1, \ldots, x_n] : p^m \in I \text{ for some } m \geq 1\}.$

An ideal $I$ is said to be a radical ideal if $\sqrt{I} = I$.

For instance,

    $x + y \in \sqrt{\langle x^2 + 3xy, 3xy + y^2 \rangle}$

in $\mathbb{Q}[x, y]$ since

    $(x + y)^3 = x(x^2 + 3xy) + y(3xy + y^2) \in \langle x^2 + 3xy, 3xy + y^2 \rangle.$

Since each of the generators of the ideal
$\langle x^2 + 3xy, 3xy + y^2 \rangle$ is homogeneous of degree 2, it is
clear that $x + y \notin \langle x^2 + 3xy, 3xy + y^2 \rangle$. It follows
that $\langle x^2 + 3xy, 3xy + y^2 \rangle$ is not a radical ideal.

Although it is not obvious from the definition, we have the following
property of the radical.

• (Radical Ideal Property) For every ideal $I \subset k[x_1, \ldots, x_n]$,
$\sqrt{I}$ is an ideal containing $I$.

See [CLO], Chapter 4, §2, for example. We will consider a number of other
operations on ideals in the exercises.

One of the most important general facts about ideals in
$k[x_1, \ldots, x_n]$ is known as the Hilbert Basis Theorem. In this context,
a basis is another name for a generating set for an ideal.

• (Hilbert Basis Theorem) Every ideal $I$ in $k[x_1, \ldots, x_n]$ has a
finite generating set. In other words, given an ideal $I$, there exists a
finite collection of polynomials
$\{f_1, \ldots, f_s\} \subset k[x_1, \ldots, x_n]$ such that
$I = \langle f_1, \ldots, f_s \rangle$.

For polynomials in one variable, this is a standard consequence of the
one-variable polynomial division algorithm.

• (Division Algorithm in $k[x]$) Given two polynomials $f, g \in k[x]$, we
can divide $f$ by $g$, producing a unique quotient $q$ and remainder $r$
such that

    $f = qg + r,$

and either $r = 0$, or $r$ has degree strictly smaller than the degree of $g$.

See, for instance, [CLO], Chapter 1, §5. The consequences of this result for
ideals in $k[x]$ are discussed in Exercise 6 below. For polynomials in
several variables, the Hilbert Basis Theorem can be proved either as a
byproduct of the theory of Gröbner bases to be reviewed in the next section
(see [CLO], Chapter 2, §5), or inductively by showing that if every ideal in
a ring $R$ is finitely generated, then the same is true in the ring $R[x]$
(see [AL], Chapter 1, §1, or [BW], Chapter 4, §1).
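The one-variable division algorithm stated above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from any of the cited texts: it assumes $k = \mathbb{Q}$, represents a polynomial as a list of coefficients with the constant term first, and the function name `poly_divmod` is a hypothetical choice.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Divide f by g in Q[x]; return (q, r) with f = q*g + r and deg r < deg g.

    Polynomials are coefficient lists, constant term first; [] is the zero
    polynomial. This representation is an assumption made for the sketch.
    """
    def trim(p):
        # Drop zero coefficients of highest degree so p[-1] is the leading one.
        while p and p[-1] == 0:
            p.pop()
        return p

    r = trim([Fraction(c) for c in f])   # intermediate dividend
    g = trim([Fraction(c) for c in g])
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)          # degree of the next quotient term
        coeff = r[-1] / g[-1]            # ratio of leading coefficients
        q[shift] = coeff
        for i, c in enumerate(g):        # subtract coeff * x^shift * g
            r[i + shift] -= coeff * c
        trim(r)                          # the leading term has cancelled
    return trim(q), r


# x^2 - 1 divided by x - 1 leaves remainder zero.
q, r = poly_divmod([-1, 0, 1], [-1, 1])
print(r == [])  # True
```

Each pass through the loop cancels the leading term of the intermediate dividend, so its degree drops and the loop stops after finitely many steps. A zero remainder, as in the usage line, is exactly the situation in which the dividend is a multiple of the divisor.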
Additional Exercises for §1

Exercise 5. Show that
$\langle y - x^2, z - x^3 \rangle = \langle z - xy, y - x^2 \rangle$ in
$\mathbb{Q}[x, y, z]$.

Exercise 6. Let $k$ be any field, and consider the polynomial ring in one
variable, $k[x]$. In this exercise, you will give one proof that every ideal
in $k[x]$ is finitely generated. In fact, every ideal $I \subset k[x]$ is
generated by a single polynomial: $I = \langle g \rangle$ for some $g$. We
may assume $I \neq \{0\}$, since there is nothing to prove when $I = \{0\}$.
Let $g$ be a nonzero element in $I$ of minimal degree. Show using the
division algorithm that every $f$ in $I$ is divisible by $g$. Deduce that
$I = \langle g \rangle$.

Exercise 7.

a. Let $k$ be any field, and let $n$ be any positive integer. Show that in
$k[x]$, $\sqrt{\langle x^n \rangle} = \langle x \rangle$.

b. More generally, suppose that

    $p(x) = (x - a_1)^{e_1} \cdots (x - a_m)^{e_m}.$

What is $\sqrt{\langle p(x) \rangle}$?

c. Let $k = \mathbb{C}$, so that every polynomial in one variable factors as
in b. What are the radical ideals in $\mathbb{C}[x]$?

Exercise 8. An ideal $I \subset k[x_1, \ldots, x_n]$ is said to be prime if
whenever a product $fg$ belongs to $I$, either $f \in I$, or $g \in I$ (or
both).

a. Show that a prime ideal is radical.

b. What are the prime ideals in $\mathbb{C}[x]$? What about the prime ideals
in $\mathbb{R}[x]$ or $\mathbb{Q}[x]$?

Exercise 9. An ideal $I \subset k[x_1, \ldots, x_n]$ is said to be maximal if
there are no ideals $J$ satisfying $I \subset J \subset k[x_1, \ldots, x_n]$
other than $J = I$ and $J = k[x_1, \ldots, x_n]$.

a. Show that $\langle x_1, x_2, \ldots, x_n \rangle$ is a maximal ideal in
$k[x_1, \ldots, x_n]$.

b. More generally show that if $(a_1, \ldots, a_n)$ is any point in $k^n$,
then the ideal
$\langle x_1 - a_1, \ldots, x_n - a_n \rangle \subset k[x_1, \ldots, x_n]$ is
maximal.

c. Show that $I = \langle x^2 + 1 \rangle$ is a maximal ideal in
$\mathbb{R}[x]$. Is $I$ maximal considered as an ideal in $\mathbb{C}[x]$?

Exercise 10. Let $I$ be an ideal in $k[x_1, \ldots, x_n]$, let $\ell \geq 1$
be an integer, and let $I_\ell$ consist of the elements in $I$ that do not
depend on the first $\ell$ variables:

    $I_\ell = I \cap k[x_{\ell+1}, \ldots, x_n].$

$I_\ell$ is called the $\ell$th elimination ideal of $I$.

a. For $I = \langle x^2 + y^2, x^2 - z^3 \rangle \subset k[x, y, z]$, show
that $y^2 + z^3$ is in the first elimination ideal $I_1$.

b. Prove that $I_\ell$ is an ideal in the ring $k[x_{\ell+1}, \ldots, x_n]$.



Exercise 11. Let $I, J$ be ideals in $k[x_1, \ldots, x_n]$, and define

    $I + J = \{f + g : f \in I,\ g \in J\}.$

a. Show that $I + J$ is an ideal in $k[x_1, \ldots, x_n]$.

b. Show that $I + J$ is the smallest ideal containing $I \cup J$.

c. If $I = \langle f_1, \ldots, f_s \rangle$ and
$J = \langle g_1, \ldots, g_t \rangle$, what is a finite generating set for
$I + J$?

Exercise 12. Let $I, J$ be ideals in $k[x_1, \ldots, x_n]$.

a. Show that $I \cap J$ is also an ideal in $k[x_1, \ldots, x_n]$.

b. Define $IJ$ to be the smallest ideal containing all the products $fg$
where $f \in I$, and $g \in J$. Show that $IJ \subset I \cap J$. Give an
example where $IJ \neq I \cap J$.

Exercise 13. Let $I, J$ be ideals in $k[x_1, \ldots, x_n]$, and define
$I : J$ (called the quotient ideal of $I$ by $J$) by

    $I : J = \{f \in k[x_1, \ldots, x_n] : fg \in I \text{ for all } g \in J\}.$

a. Show that $I : J$ is an ideal in $k[x_1, \ldots, x_n]$.

b. Show that if $I \cap \langle h \rangle = \langle g_1, \ldots, g_t \rangle$
(so each $g_i$ is divisible by $h$), then a basis for $I : \langle h \rangle$
is obtained by cancelling the factor of $h$ from each $g_i$:

    $I : \langle h \rangle = \langle g_1/h, \ldots, g_t/h \rangle.$



§2 Monomial Orders and Polynomial Division 

The examples of ideals that we considered in §1 were artificially simple. In
general, it can be difficult to determine by inspection or by trial and error
whether a given polynomial $f \in k[x_1, \ldots, x_n]$ is an element of a
given ideal $I = \langle f_1, \ldots, f_s \rangle$, or whether two ideals
$I = \langle f_1, \ldots, f_s \rangle$ and
$J = \langle g_1, \ldots, g_t \rangle$ are equal. In this section and the
next one, we will consider a collection of algorithms that can be used to
solve problems such as deciding ideal membership, deciding ideal equality,
computing ideal intersections and quotients, and computing elimination
ideals. See the exercises at the end of §3 for some examples.

The starting point for these algorithms is, in a sense, the polynomial
division algorithm in $k[x]$ introduced at the end of §1. In Exercise 6 of
§1, we saw that the division algorithm implies that every ideal
$I \subset k[x]$ has the form $I = \langle g \rangle$ for some $g$. Hence, if
$f \in k[x]$, we can also use division to determine whether $f \in I$.

Exercise 1. Let $I = \langle g \rangle$ in $k[x]$ and let $f \in k[x]$ be any
polynomial. Let $q, r$ be the unique quotient and remainder in the expression
$f = qg + r$ produced by polynomial division. Show that $f \in I$ if and only
if $r = 0$.

Exercise 2. Formulate and prove a criterion for equality of ideals
$I_1 = \langle g_1 \rangle$ and $I_2 = \langle g_2 \rangle$ in $k[x]$ based
on division.

Given the usefulness of division for polynomials in one variable, we may
ask: Is there a corresponding notion for polynomials in several variables?
The answer is yes, and to describe it, we need to begin by considering
different ways to order the monomials appearing within a polynomial.

(2.1) Definition. A monomial order on $k[x_1, \ldots, x_n]$ is any relation
$>$ on the set of monomials $x^\alpha$ in $k[x_1, \ldots, x_n]$ (or
equivalently on the exponent vectors $\alpha \in \mathbb{Z}^n_{\geq 0}$)
satisfying:

a. $>$ is a total (linear) ordering relation.

b. $>$ is compatible with multiplication in $k[x_1, \ldots, x_n]$, in the
sense that if $x^\alpha > x^\beta$ and $x^\gamma$ is any monomial, then
$x^\alpha x^\gamma = x^{\alpha+\gamma} > x^{\beta+\gamma} = x^\beta x^\gamma$.

c. $>$ is a well-ordering. That is, every non-empty collection of monomials
has a smallest element under $>$.

Condition a implies that the terms appearing within any polynomial $f$
can be uniquely listed in increasing or decreasing order under $>$. Then
condition b shows that this ordering does not change if we multiply $f$ by
a monomial $x^\gamma$. Finally, condition c is used to ensure that processes
that work on collections of monomials, e.g. the collection of all monomials
less than some fixed monomial $x^\alpha$, will terminate in a finite number
of steps.

The division algorithm in $k[x]$ makes use of a monomial order implicitly:
When we divide $g$ into $f$ by hand, we always compare the leading term
(the term of highest degree) in $g$ with the leading term of the intermediate
dividend. In fact there is no choice in the matter in this case.

Exercise 3. Show that the only monomial order on $k[x]$ is the degree order
on monomials, given by

    $\cdots > x^{n+1} > x^n > \cdots > x^3 > x^2 > x > 1.$

For polynomial rings in several variables, there are many choices of
monomial orders. In writing the exponent vectors $\alpha$ and $\beta$ in
monomials $x^\alpha$ and $x^\beta$ as ordered $n$-tuples, we implicitly set
up an ordering on the variables $x_i$ in $k[x_1, \ldots, x_n]$:

    $x_1 > x_2 > \cdots > x_n.$

With this choice, there are still many ways to define monomial orders. Two
of the most important are given in the following definitions.

(2.2) Definition (Lexicographic Order). Let $x^\alpha$ and $x^\beta$ be
monomials in $k[x_1, \ldots, x_n]$. We say $x^\alpha >_{lex} x^\beta$ if in
the difference $\alpha - \beta \in \mathbb{Z}^n$, the left-most nonzero entry
is positive.

Lexicographic order is analogous to the ordering of words used in
dictionaries.

(2.3) Definition (Graded Reverse Lexicographic Order). Let $x^\alpha$
and $x^\beta$ be monomials in $k[x_1, \ldots, x_n]$. We say
$x^\alpha >_{grevlex} x^\beta$ if
$\sum_{i=1}^n \alpha_i > \sum_{i=1}^n \beta_i$, or if
$\sum_{i=1}^n \alpha_i = \sum_{i=1}^n \beta_i$, and in the difference
$\alpha - \beta \in \mathbb{Z}^n$, the right-most nonzero entry is negative.

For instance in $k[x, y, z]$, with $x > y > z$, we have

(2.4)    $x^3 y^2 z >_{lex} x^2 y^6 z^{12}$

since when we compute the difference of the exponent vectors:

    $(3, 2, 1) - (2, 6, 12) = (1, -4, -11),$

the left-most nonzero entry is positive. Similarly,

    $x^3 y^6 >_{lex} x^3 y^4 z$

since in $(3, 6, 0) - (3, 4, 1) = (0, 2, -1)$, the left-most nonzero entry is
positive. Comparing the lex and grevlex orders shows that the results can be
quite different. For instance, it is true that

    $x^2 y^6 z^{12} >_{grevlex} x^3 y^2 z.$

Compare this with (2.4), which contains the same monomials. Indeed, lex
and grevlex are different orderings even on the monomials of the same total
degree in three or more variables, as we can see by considering pairs of
monomials such as $x^2 y^2 z^2$ and $x y^4 z$. Since
$(2, 2, 2) - (1, 4, 1) = (1, -2, 1)$,

    $x^2 y^2 z^2 >_{lex} x y^4 z.$

On the other hand, by Definition (2.3),

    $x y^4 z >_{grevlex} x^2 y^2 z^2.$
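The comparisons in these examples can be replayed mechanically. The sketch below is illustrative only: it assumes that a monomial is stored as its tuple of exponents, and the function names `lex_greater` and `grevlex_greater` are hypothetical; the bodies transcribe Definitions (2.2) and (2.3) directly.

```python
def lex_greater(alpha, beta):
    """x^alpha >_lex x^beta: left-most nonzero entry of alpha - beta is positive."""
    for a, b in zip(alpha, beta):
        if a != b:
            return a > b
    return False  # equal exponent vectors


def grevlex_greater(alpha, beta):
    """x^alpha >_grevlex x^beta: larger total degree, or equal total degree and
    the right-most nonzero entry of alpha - beta is negative."""
    if sum(alpha) != sum(beta):
        return sum(alpha) > sum(beta)
    for a, b in zip(reversed(alpha), reversed(beta)):
        if a != b:
            return a < b
    return False  # equal exponent vectors


# Exponent vectors with respect to x > y > z, taken from the examples above.
print(lex_greater((3, 2, 1), (2, 6, 12)))      # True, as in (2.4)
print(grevlex_greater((2, 6, 12), (3, 2, 1)))  # True, the order is reversed
print(lex_greater((2, 2, 2), (1, 4, 1)))       # True
print(grevlex_greater((1, 4, 1), (2, 2, 2)))   # True
```

The last two lines show the two orders disagreeing on a pair of monomials of the same total degree 6.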

Exercise 4. Show that both $>_{lex}$ and $>_{grevlex}$ are monomial orders in
$k[x_1, \ldots, x_n]$ according to Definition (2.1).

Exercise 5. Show that the monomials of a fixed total degree $d$ in two
variables $x > y$ are ordered in the same sequence by $>_{lex}$ and
$>_{grevlex}$. Are these orderings the same on all of $k[x, y]$ though? Why
or why not?

The natural generalization of the leading term (term of highest degree) in
a polynomial in $k[x]$ is defined as follows. Picking any particular monomial
order $>$ on $k[x_1, \ldots, x_n]$, we consider the terms in
$f = \sum_\alpha c_\alpha x^\alpha$. Then the leading term of $f$ (with
respect to $>$) is the product $c_\alpha x^\alpha$ where $x^\alpha$ is the
largest monomial appearing in $f$ in the ordering $>$. We will use the
notation $\mathrm{LT}_>(f)$ for the leading term, or just $\mathrm{LT}(f)$ if
there is no chance of confusion about which monomial order is being used.

For example, consider $f = 3x^3 y^2 + x^2 y z^3$ in $\mathbb{Q}[x, y, z]$
(with variables ordered $x > y > z$ as usual). We have

    $\mathrm{LT}_{>_{lex}}(f) = 3x^3 y^2$

since $x^3 y^2 >_{lex} x^2 y z^3$. On the other hand

    $\mathrm{LT}_{>_{grevlex}}(f) = x^2 y z^3$

since the total degree of the second term is 6 and the total degree of the
first is 5.
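The dependence of the leading term on the monomial order can also be seen computationally. The following sketch makes two assumptions that are not part of the text: a polynomial is a dictionary mapping exponent tuples to coefficients, and each order is encoded as a sort key (the names `lex_key`, `grevlex_key` and `leading_term` are hypothetical).

```python
def lex_key(alpha):
    # Tuples compare left to right, which is exactly the lex rule.
    return tuple(alpha)


def grevlex_key(alpha):
    # Total degree first; ties are broken by the right-most differing entry,
    # where the SMALLER exponent wins, hence the reversal and negation.
    return (sum(alpha), tuple(-a for a in reversed(alpha)))


def leading_term(poly, key):
    """Return (coefficient, exponent vector) of the largest monomial in poly."""
    alpha = max(poly, key=key)
    return poly[alpha], alpha


# f = 3x^3y^2 + x^2yz^3 with variables ordered x > y > z.
f = {(3, 2, 0): 3, (2, 1, 3): 1}
print(leading_term(f, lex_key))      # (3, (3, 2, 0))
print(leading_term(f, grevlex_key))  # (1, (2, 1, 3))
```

The two printed pairs correspond to the leading terms $3x^3y^2$ and $x^2yz^3$ found above.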

Choosing any monomial order in $k[x_1, \ldots, x_n]$ gives all the
information necessary to establish a generalized division algorithm.

• (Division Algorithm in $k[x_1, \ldots, x_n]$) Fix any monomial order $>$ in
$k[x_1, \ldots, x_n]$, and let $F = (f_1, \ldots, f_s)$ be an ordered
$s$-tuple of polynomials in $k[x_1, \ldots, x_n]$. Then every
$f \in k[x_1, \ldots, x_n]$ can be written as:

(2.5)    $f = a_1 f_1 + \cdots + a_s f_s + r,$

where $a_i, r \in k[x_1, \ldots, x_n]$, and either $r = 0$, or $r$ is a linear
combination of monomials, none of which is divisible by any of
$\mathrm{LT}_>(f_1), \ldots, \mathrm{LT}_>(f_s)$. We will call $r$ a
remainder of $f$ on division by $F$.

[CLO], Chapter 2, §3, and [AL], Chapter 1, §5 give one particular
algorithmic form of the division process, in which the intermediate dividend
is reduced at each step using the divisor $f_i$ with the smallest possible
$i$ such that $\mathrm{LT}(f_i)$ divides the leading term of the intermediate
dividend. A characterization of the expression (2.5) that is produced by this
version of division can be found in Exercise 11 of Chapter 2, §3 of [CLO].
[AL] and [BW], Chapter 5, §1 also consider more general forms of division or
polynomial reduction procedures.

You should note two differences between this statement and the division
algorithm in $k[x]$. First, we are allowing the possibility of dividing $f$
by an $s$-tuple of polynomials with $s > 1$. The reason for this is that we
will usually want to think of the divisors $f_i$ as generators for some
particular ideal $I$, and ideals in $k[x_1, \ldots, x_n]$ for $n \geq 2$
might not be generated by any single polynomial. Second, although any
algorithmic version of division, such as the one presented in Chapter 2 of
[CLO], produces one particular expression of the form (2.5) for each ordered
$s$-tuple $F$ and each $f$, there are always different expressions of this
form for a given $f$ as well. Reordering $F$ or changing the monomial order
can produce different $a_i$ and $r$ in some cases. See Exercises 8 and 9
below for some examples.

We will sometimes use the notation

(2.6)    $r = \overline{f}^F$

for a remainder on division by $F$.

Most computer algebra systems that have Gröbner basis packages provide
implementations of some form of the division algorithm. However, in most
cases the output of the division command is just the remainder
$\overline{f}^F$, the quotients are not saved or displayed, and an algorithm
different from the one described in [CLO], Chapter 2, §3 may be used. For
instance, the Maple grobner package contains a function normalf which
computes a remainder on division of a polynomial by any collection of
polynomials. To use it, one must start by loading the grobner package (just
once in a session) with

    with(grobner);

The format for the normalf command is

    normalf(f, F, vars, torder);

where f is the dividend polynomial, F is the ordered list of divisors (in
square brackets, separated by commas), vars is the ordered list of variables
(also in square brackets, separated by commas), and torder is either plex
for $>_{lex}$ or tdeg for $>_{grevlex}$. For instance, if we list [x,y] for
vars and plex for torder, then we get the $>_{lex}$ order with $x > y$. Let
us consider dividing $f_1 = x^2 y^2 - x$ and $f_2 = x y^3 + y$ into
$f = x^3 y^2 + 2 x y^4$ using the lex order on $\mathbb{Q}[x, y]$ with
$x > y$. The Maple commands

         f := x^3*y^2 + 2*x*y^4;
(2.7)    F := [x^2*y^2 - x, x*y^3 + y];
         normalf(f, F, [x,y], plex);

will produce as output

(2.8)    $x^2 - 2y^2$.

Thus the remainder is $\overline{f}^F = x^2 - 2y^2$. The results from normalf
may be different from those computed by the algorithm from [CLO], Chapter 2,
§3 for some inputs.
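For readers without access to Maple, the division process can be sketched in plain Python. This is a minimal illustration and not the algorithm used by normalf: it assumes the lex order, coefficients in $\mathbb{Q}$, polynomials stored as dictionaries from exponent tuples to coefficients, and the rule (described above for the version in [CLO]) of always using the first divisor whose leading term divides the leading term of the intermediate dividend. The names `add_term`, `leading` and `divide` are hypothetical.

```python
from fractions import Fraction


def add_term(poly, mono, coeff):
    """Add coeff * x^mono to poly in place, deleting the entry if it becomes 0."""
    coeff = poly.get(mono, 0) + coeff
    if coeff == 0:
        poly.pop(mono, None)
    else:
        poly[mono] = coeff


def leading(poly):
    """Leading monomial and coefficient under lex (tuples compare left to right)."""
    mono = max(poly)
    return mono, poly[mono]


def divide(f, F):
    """Return (quotients, remainder) with f = sum(a_i * f_i) + r as in (2.5)."""
    p = dict(f)                        # intermediate dividend
    quotients = [dict() for _ in F]
    r = {}
    while p:
        m, c = leading(p)
        for a, fi in zip(quotients, F):
            mi, ci = leading(fi)
            if all(x >= y for x, y in zip(m, mi)):   # LT(f_i) divides LT(p)
                t = tuple(x - y for x, y in zip(m, mi))
                coeff = Fraction(c) / ci
                add_term(a, t, coeff)
                for mono, cf in fi.items():          # p := p - coeff * x^t * f_i
                    shifted = tuple(x + y for x, y in zip(t, mono))
                    add_term(p, shifted, -coeff * cf)
                break
        else:
            add_term(r, m, c)          # no leading term divides: move to remainder
            del p[m]
    return quotients, r


# The example from (2.7), with exponent vectors taken with respect to x > y.
f = {(3, 2): 1, (1, 4): 2}                           # x^3y^2 + 2xy^4
F = [{(2, 2): 1, (1, 0): -1}, {(1, 3): 1, (0, 1): 1}]
quotients, r = divide(f, F)
print(r == {(2, 0): 1, (0, 2): -2})                  # True: remainder x^2 - 2y^2
```

On this input the sketch also returns the quotients $a_1 = x$ and $a_2 = 2y$, which normalf does not display. For other inputs its remainder need not agree with the one Maple reports.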



Additional Exercises for §2 

Exercise 6. Verify by hand that the remainder from (2.8) occurs in an
expression

    $f = a_1 f_1 + a_2 f_2 + x^2 - 2y^2,$

where $a_1 = x$, $a_2 = 2y$, and $f_i$ are as in the discussion before (2.7).

Exercise 7. Show that reordering the variables and changing the monomial
order to tdeg has no effect in (2.7).

Exercise 8. What happens if you change F in (2.7) to

    $F = [x^2 y^2 - x^4, x y^3 - y^4]$

and take $f = x^2 y^6$? Does changing the order of the variables make a
difference now?

Exercise 9. Now change F to

    $F = [x^2 y^2 - z^4, x y^3 - y^4],$

take $f = x^2 y^6 + z^5$, change vars to [x,y,z] (and permutations of this
list) and change the monomial order. What do you observe?



§3 Gröbner Bases

Since we now have a division algorithm in $k[x_1, \ldots, x_n]$ that seems to
have many of the same features as the one-variable version, it is natural
to ask if deciding whether a given $f \in k[x_1, \ldots, x_n]$ is a member of
a given ideal $I = \langle f_1, \ldots, f_s \rangle$ can be done along the
lines of Exercise 1 in §2, by computing the remainder on division. One
direction is easy. Namely, from (2.5) it follows that if
$r = \overline{f}^F = 0$ on dividing by $F = (f_1, \ldots, f_s)$, then
$f = a_1 f_1 + \cdots + a_s f_s$. By definition then,
$f \in \langle f_1, \ldots, f_s \rangle$. On the other hand, the following
exercise shows that we are not guaranteed to get $\overline{f}^F = 0$ for
every $f \in \langle f_1, \ldots, f_s \rangle$ if we use an arbitrary basis
$F$ for $I$.

Exercise 1. Recall from (1.4) that $p = x^2 + \tfrac{1}{2} y^2 z - z - 1$ is
an element of the ideal
$I = \langle x^2 + z^2 - 1, x^2 + y^2 + (z - 1)^2 - 4 \rangle$. Show,
however, that the remainder on division of $p$ by this generating set $F$ is
not zero. For instance, using $>_{lex}$, we get a remainder

    $\overline{p}^F = \tfrac{1}{2} y^2 z - z - z^2.$

What went wrong here? From (2.5) and the fact that $f \in I$ in this case,
it follows that the remainder is also an element of $I$. However,
$\overline{p}^F$ is not zero because it contains terms that cannot be removed
by division by these particular generators for $I$. The leading terms of
$f_1 = x^2 + z^2 - 1$ and $f_2 = x^2 + y^2 + (z - 1)^2 - 4$ do not divide the
leading term of $\overline{p}^F$. In order for division to produce zero
remainders for all elements of $I$, we need to be able to remove all leading
terms of elements of $I$ using the leading terms of the divisors. That is the
motivation for the following definition.
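The division in Exercise 1 takes a single step. The following worked computation is a sketch that assumes the lex order with $x > y > z$ and the division procedure of [CLO], Chapter 2, §3, with $f_1$ listed first:

```latex
\begin{align*}
\mathrm{LT}(p) &= x^2 = \mathrm{LT}(f_1) = \mathrm{LT}(f_2), \\
p - 1 \cdot f_1 &= \left(x^2 + \tfrac12 y^2 z - z - 1\right) - \left(x^2 + z^2 - 1\right)
                 = \tfrac12 y^2 z - z^2 - z .
\end{align*}
```

None of the three remaining terms is divisible by $x^2$, so each of them is moved to the remainder in turn, and the process stops with $a_1 = 1$, $a_2 = 0$ and a nonzero remainder, even though $p \in I$.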
(3.1) Definition. Fix a monomial order $>$ on $k[x_1, \ldots, x_n]$, and let
$I \subset k[x_1, \ldots, x_n]$ be an ideal. A Gröbner basis for $I$ (with
respect to $>$) is a finite collection of polynomials
$G = \{g_1, \ldots, g_t\} \subset I$ with the property that for every nonzero
$f \in I$, $\mathrm{LT}(f)$ is divisible by $\mathrm{LT}(g_i)$ for some $i$.

We will see in a moment (Exercise 3) that a Gröbner basis for $I$ is indeed
a basis for $I$, i.e., $I = \langle g_1, \ldots, g_t \rangle$. Of course, it
must be proved that Gröbner bases exist for all $I$ in
$k[x_1, \ldots, x_n]$. This can be done in a non-constructive way by
considering the ideal $\langle \mathrm{LT}(I) \rangle$ generated by the
leading terms of all the elements in $I$ (a monomial ideal). By a direct
argument (Dickson’s Lemma: see [CLO], Chapter 2, §4, or [BW], Chapter 4, §3,
or [AL], Chapter 1, §4), or by the Hilbert Basis Theorem, the ideal
$\langle \mathrm{LT}(I) \rangle$ has a finite generating set consisting of
monomials $x^{\alpha(i)}$ for $i = 1, \ldots, t$. By the definition of
$\langle \mathrm{LT}(I) \rangle$, there is an element $g_i \in I$ such that
$\mathrm{LT}(g_i) = x^{\alpha(i)}$ for each $i = 1, \ldots, t$.

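Exercise 2. Show that if $\langle \mathrm{LT}(I) \rangle = \langle x^{\alpha(1)}, \ldots, x^{\alpha(t)} \rangle$, and if $g_i \in I$ are
polynomials such that $\mathrm{LT}(g_i) = x^{\alpha(i)}$ for each $i = 1, \ldots, t$, then
$G = \{g_1, \ldots, g_t\}$ is a Gröbner basis for $I$.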

Remainders computed by division with respect to a Grobner basis are 
much better behaved than those computed with respect to arbitrary sets 
of divisors. For instance, we have the following results. 

Exercise 3. 

a. Show that if $G$ is a Gröbner basis for $I$, then for any $f \in I$, the remainder
on division of $f$ by $G$ (listed in any order) is zero.

b. Deduce that $I = \langle g_1, \ldots, g_t \rangle$ if $G = \{g_1, \ldots, g_t\}$ is a Gröbner basis for
$I$. (If $I = \langle 0 \rangle$, then $G = \emptyset$ and we make the convention that $\langle \emptyset \rangle = \{0\}$.)

Exercise 4. If $G$ is a Gröbner basis for an ideal $I$, and $f$ is an arbitrary
polynomial, show that if the algorithm of [CLO], Chapter 2, §3 is used, the
remainder on division of $f$ by $G$ is independent of the ordering of $G$. Hint:
If two different orderings of $G$ are used, producing remainders $r_1$ and $r_2$,
consider the difference $r_1 - r_2$.

Generalizing the result of Exercise 4, we also have the following important 
statement. 

• (Uniqueness of Remainders) Fix a monomial order $>$ and let
$I \subset k[x_1, \ldots, x_n]$ be an ideal. Division of $f \in k[x_1, \ldots, x_n]$ by a Gröbner
basis for $I$ produces an expression $f = g + r$ where $g \in I$ and no term
in $r$ is divisible by any element of $\mathrm{LT}(I)$. If $f = g' + r'$ is any other such
expression, then $r = r'$.

See [CLO], Chapter 2, §6, [AL], Chapter 1, §6, or [BW], Chapter 5, §2.
In other words, the remainder on division of $f$ by a Gröbner basis for $I$
is a uniquely determined normal form for $f$ modulo $I$ depending only on
the choice of monomial order and not on the way the division is performed.
Indeed, uniqueness of remainders gives another characterization of Gröbner
bases.

More useful for many purposes than the existence proof for Gröbner
bases above is an algorithm, due to Buchberger, that takes an arbitrary
generating set $\{f_1, \ldots, f_s\}$ for $I$ and produces a Gröbner basis $G$ for $I$
from it. This algorithm works by forming new elements of $I$ using expressions
guaranteed to cancel leading terms and uncover other possible leading
terms, according to the following recipe.

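(3.2) Definition. Let $f, g \in k[x_1, \ldots, x_n]$ be nonzero. Fix a monomial
order and let

$\mathrm{LT}(f) = c x^\alpha$ and $\mathrm{LT}(g) = d x^\beta$,

where $c, d \in k$. Let $x^\gamma$ be the least common multiple of $x^\alpha$ and $x^\beta$. The
S-polynomial of $f$ and $g$, denoted $S(f, g)$, is the polynomial

$S(f, g) = \frac{x^\gamma}{\mathrm{LT}(f)} \cdot f - \frac{x^\gamma}{\mathrm{LT}(g)} \cdot g.$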

Note that by definition $S(f, g) \in \langle f, g \rangle$. For example, with
$f = x^3 y - 2x^2 y^2 + x$ and $g = 3x^4 - y$ in $\mathbb{Q}[x, y]$, and using $>_{lex}$, we have
$x^\gamma = x^4 y$, and

$S(f, g) = x f - (y/3) g = -2x^3 y^2 + x^2 + y^2/3.$

In this case, the leading term of the S-polynomial is divisible by the
leading term of $f$. We might consider taking the remainder on division by
$F = (f, g)$ to uncover possible new leading terms of elements in $\langle f, g \rangle$. And
indeed in this case we find that the remainder is

(3.3) $\overline{S(f, g)}^F = -4x^2 y^3 + x^2 + 2xy + y^2/3$

and $\mathrm{LT}(\overline{S(f, g)}^F) = -4x^2 y^3$ is divisible by neither $\mathrm{LT}(f)$ nor $\mathrm{LT}(g)$. An
important result about this process of forming S-polynomial remainders is
the following statement.

• (Buchberger's Criterion) A finite set $G = \{g_1, \ldots, g_t\} \subset I$ that generates $I$
is a Gröbner basis of $I$ if and only if $\overline{S(g_i, g_j)}^G = 0$ for all pairs $i \neq j$.

See [CLO], Chapter 2, §7, [BW], Chapter 5, §3, or [AL], Chapter 1, §7.
Using this criterion, we obtain a very rudimentary procedure for
producing a Gröbner basis of a given ideal.
• (Buchberger's Algorithm)

    Input: $F = (f_1, \ldots, f_s)$
    Output: a Gröbner basis $G = \{g_1, \ldots, g_t\}$ for $I = \langle F \rangle$, with $F \subset G$
    G := F
    REPEAT
        G' := G
        FOR each pair $p \neq q$ in G' DO
            S := $\overline{S(p, q)}^{G'}$
            IF S $\neq$ 0 THEN G := G $\cup$ {S}
    UNTIL G = G'

See [CLO], Chapter 2, §6, [BW], Chapter 5, §3, or [AL], Chapter 1, §7. For
instance, in the example above we would adjoin $h = \overline{S(f, g)}^F$ from (3.3)
to our set of polynomials. There are two new S-polynomials to consider
now: $S(f, h)$ and $S(g, h)$. Their remainders on division by $(f, g, h)$ would
be computed and adjoined to the collection if they are nonzero. Then we
would continue, forming new S-polynomials and remainders to determine
whether further polynomials must be included.
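
The rudimentary procedure above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than an efficient implementation: polynomials in $\mathbb{Q}[x, y]$ are assumed to be dictionaries from exponent pairs to rational coefficients, the order is lex with x > y (Python's tuple comparison), and the names `leading`, `subtract_multiple`, `remainder`, `s_polynomial` and `buchberger` are hypothetical helpers introduced only for this example.

```python
from fractions import Fraction
from itertools import combinations

def leading(p):
    """Exponent tuple and coefficient of the lex-leading term (x > y)."""
    e = max(p)
    return e, p[e]

def subtract_multiple(p, q, shift, g):
    """In place: p := p - q * x^shift * g."""
    for me, mc in g.items():
        key = tuple(a + b for a, b in zip(me, shift))
        value = p.get(key, 0) - q * mc
        if value == 0:
            p.pop(key, None)
        else:
            p[key] = value

def remainder(f, divisors):
    """Remainder of f on division by the ordered list of divisors."""
    p, r = dict(f), {}
    while p:
        e, c = leading(p)
        for g in divisors:
            ge, gc = leading(g)
            if all(a >= b for a, b in zip(e, ge)):
                shift = tuple(a - b for a, b in zip(e, ge))
                subtract_multiple(p, Fraction(c) / gc, shift, g)
                break
        else:
            r[e] = p.pop(e)
    return r

def s_polynomial(f, g):
    """S(f, g) = (x^gamma / LT(f)) f - (x^gamma / LT(g)) g, as in Definition (3.2)."""
    (fe, fc), (ge, gc) = leading(f), leading(g)
    lcm = tuple(max(a, b) for a, b in zip(fe, ge))
    s = {}
    subtract_multiple(s, -Fraction(1) / fc, tuple(a - b for a, b in zip(lcm, fe)), f)
    subtract_multiple(s, Fraction(1) / gc, tuple(a - b for a, b in zip(lcm, ge)), g)
    return s

def buchberger(F):
    """REPEAT ... UNTIL loop of the rudimentary algorithm (no reduction of the output)."""
    G = [dict(f) for f in F]
    while True:
        G_old = list(G)
        for p, q in combinations(G_old, 2):
            s = remainder(s_polynomial(p, q), G_old)
            if s and s not in G:
                G.append(s)
        if len(G) == len(G_old):
            return G

f = {(3, 1): 1, (2, 2): -2, (1, 0): 1}          # x^3 y - 2 x^2 y^2 + x
g = {(4, 0): 3, (0, 1): -1}                     # 3 x^4 - y
h = remainder(s_polynomial(f, g), [f, g])       # the polynomial in (3.3)
G = buchberger([f, g])
```

The first pass of `buchberger` adjoins exactly the polynomial `h` of (3.3). The list `G` returned at the end contains the original `f` and `g` and is usually far from reduced, so it should not be expected to coincide with the output of a computer algebra system.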

Exercise 5. Carry out Buchberger’s Algorithm on the example above, 
continuing from (3.3). (You may want to use a computer algebra system 
for this.) 

In Maple, there is an implementation of a more sophisticated version of 
Buchberger’s algorithm in the grobner package. The relevant command is 
called gbasis, and the format is 

gbasis(F,vars,torder);

Here F is a list of polynomials, vars is the list of variables, and torder 
specifies the monomial order. See the description of the normalf command 
in §2 for more details. For instance, the commands 

F := [x^3*y - 2*x^2*y^2 + x, 3*x^4 - y];
gbasis(F,[x,y],plex);

will compute a lex Gröbner basis for the ideal from Exercise 5. The output
is

(3.4) $[252x - 624y^7 + 493y^4 - 3y,\ 6y^4 - 49y^7 + 48y^{10} - 9y]$

(possibly up to the ordering of the terms, which can vary). This is not the 
same as the result of the rudimentary form of Buchberger’s algorithm given 
before. For instance, notice that neither of the polynomials in F actually 
appears in the output. The reason is that the gbasis function actually
computes what we will refer to as a reduced Grobner basis for the ideal 
generated by the list F. 

(3.5) Definition. A reduced Gröbner basis for an ideal $I \subset k[x_1, \ldots, x_n]$
is a Gröbner basis $G$ for $I$ such that for all distinct $p, q \in G$, no monomial
appearing in $p$ is a multiple of $\mathrm{LT}(q)$. A monic Gröbner basis is a reduced
Gröbner basis in which the leading coefficient of every polynomial is 1, or
$\emptyset$ if $I = \langle 0 \rangle$.

Exercise 6. Verify that (3.4) is a reduced Grobner basis according to this 
definition. 

Exercise 7. Compute a Gröbner basis $G$ for the ideal $I$ from Exercise 1
of this section. Verify that $\overline{p}^G = 0$ now, in agreement with the result of
Exercise 3.

A comment is in order concerning (3.5). Many authors include the con- 
dition that the leading coefficient of each element in G is 1 in the definition 
of a reduced Grobner basis. However, many computer algebra systems (in- 
cluding Maple, see (3.4)) do not perform that extra normalization because 
it often increases the amount of storage space needed for the Grobner basis 
elements when the coefficient field is Q. The reason that condition is often 
included, however, is the following statement. 

• (Uniqueness of Monic Gröbner Bases) Fix a monomial order $>$ on
$k[x_1, \ldots, x_n]$. Each ideal $I$ in $k[x_1, \ldots, x_n]$ has a unique monic Gröbner
basis with respect to $>$.

See [CLO], Chapter 2, §7, [AL], Chapter 1, §8, or [BW], Chapter 5, §2. 
Of course, varying the monomial order can change the reduced Grobner 
basis guaranteed by this result, and one reason different monomial orders 
are considered is that the corresponding Grobner bases can have different, 
useful properties. One interesting feature of (3.4), for instance, is that the 
second polynomial in the basis does not depend on x. In other words, it 
is an element of the elimination ideal $I \cap \mathbb{Q}[y]$. In fact lex Gröbner bases
systematically eliminate variables. This is the content of the Elimination 
Theorem from [CLO], Chapter 3, §1. Also see Chapter 2, §1 of this book 
for further discussion and applications of this remark. On the other hand, 
the grevlex order often minimizes the amount of computation needed to 
produce a Grobner basis, so if no other special properties are required, it 
can be the best choice of monomial order. Other product orders and weight 
orders are used in many applications to produce Grobner bases with special 
properties. See Chapter 8 for some examples. 
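
The elimination property of (3.4) can be checked numerically. The sketch below is illustrative and makes two assumptions: it uses floating point arithmetic, and it finds a root of the $y$-only polynomial by plain bisection after substituting $u = y^3$ (the bracketing interval $[1, 1.2]$ was chosen by hand). It then back-substitutes into the first polynomial of (3.4) to recover $x$ and evaluates the original generators.

```python
def f(x, y):
    return x**3 * y - 2 * x**2 * y**2 + x      # first generator of the ideal

def g(x, y):
    return 3 * x**4 - y                        # second generator of the ideal

def elim(u):
    """Second polynomial of (3.4) divided by y, written in u = y^3."""
    return 48 * u**3 - 49 * u**2 + 6 * u - 9

# Bisection on [1, 1.2], where elim changes sign.
lo, hi = 1.0, 1.2
for _ in range(60):
    mid = (lo + hi) / 2
    if elim(lo) * elim(mid) <= 0:
        hi = mid
    else:
        lo = mid
y = ((lo + hi) / 2) ** (1 / 3)

# First polynomial of (3.4): 252 x = 624 y^7 - 493 y^4 + 3 y.
x = (624 * y**7 - 493 * y**4 + 3 * y) / 252
```

Both `f(x, y)` and `g(x, y)` come out at the level of rounding error, as expected for a common zero of the ideal: the variable $x$ was never solved for directly, only read off from the basis once $y$ was known.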
Additional Exercises for §3

Exercise 8. Consider the ideal $I = \langle x^2 y^2 - x,\ x y^3 + y \rangle$ from (2.7).

a. Using $>_{lex}$ in $\mathbb{Q}[x, y]$, compute a Gröbner basis $G$ for $I$.

b. Verify that each basis element $g$ you obtain is in $I$, by exhibiting
equations $g = A(x^2 y^2 - x) + B(x y^3 + y)$ for suitable $A, B \in \mathbb{Q}[x, y]$.

c. Let $f = x^3 y^2 + 2xy^4$. What is $\overline{f}^G$? How does this compare with the
result in (2.7)?

Exercise 9. What monomials can appear in remainders with respect to 
the Gröbner basis G in (3.4)? What monomials appear in leading terms of
elements of the ideal generated by G? 

Exercise 10. Let $G$ be a Gröbner basis for an ideal $I \subset k[x_1, \ldots, x_n]$ and
suppose there exist distinct $p, q \in G$ such that $\mathrm{LT}(p)$ is divisible by $\mathrm{LT}(q)$.
Show that $G \setminus \{p\}$ is also a Gröbner basis for $I$. Use this observation,
together with division, to propose an algorithm for producing a reduced
Gröbner basis for $I$ given $G$ as input.

Exercise 11. This exercise will sketch a Gröbner basis method for
computing the intersection of two ideals. It relies on the Elimination
Theorem for lex Gröbner bases, as stated in [CLO], Chapter 3, §1. Let
$I = \langle f_1, \ldots, f_s \rangle \subset k[x_1, \ldots, x_n]$ be an ideal. Given $f(t)$ an arbitrary
polynomial in $k[t]$, consider the ideal

$f(t) I = \langle f(t) f_1, \ldots, f(t) f_s \rangle \subset k[x_1, \ldots, x_n, t].$

a. Let $I, J$ be ideals in $k[x_1, \ldots, x_n]$. Show that

$I \cap J = (tI + (1 - t)J) \cap k[x_1, \ldots, x_n].$

b. Using the Elimination Theorem, deduce that a Gröbner basis $G$ for $I \cap J$
can be found by first computing a Gröbner basis $H$ for $tI + (1 - t)J$
using a lex order on $k[x_1, \ldots, x_n, t]$ with the variables ordered $t > x_i$
for all $i$, and then letting $G = H \cap k[x_1, \ldots, x_n]$.

Exercise 12. Using the result of Exercise 11, derive a Gröbner basis
method for computing the quotient ideal $I : \langle h \rangle$. Hint: Exercise 13 of §1
shows that if $I \cap \langle h \rangle$ is generated by $g_1, \ldots, g_t$, then $I : \langle h \rangle$ is generated by
$g_1/h, \ldots, g_t/h$.



§4 Affine Varieties

We will call the set $k^n = \{(a_1, \ldots, a_n) : a_1, \ldots, a_n \in k\}$ the affine
n-dimensional space over $k$. With $k = \mathbb{R}$, for example, we have the usual
coordinatized Euclidean space $\mathbb{R}^n$. Each polynomial $f \in k[x_1, \ldots, x_n]$
defines a function $f : k^n \to k$. The value of $f$ at $(a_1, \ldots, a_n) \in k^n$ is
obtained by substituting $x_i = a_i$, and evaluating the resulting expression
in $k$. More precisely, if we write $f = \sum_\alpha c_\alpha x^\alpha$ for $c_\alpha \in k$, then
$f(a_1, \ldots, a_n) = \sum_\alpha c_\alpha a^\alpha \in k$, where

$a^\alpha = a_1^{\alpha_1} \cdots a_n^{\alpha_n}.$

We recall the following basic fact. 

• (Zero Function) If $k$ is an infinite field, then $f : k^n \to k$ is the zero
function if and only if $f = 0 \in k[x_1, \ldots, x_n]$.

See, for example, [CLO], Chapter 1, §1. As a consequence, when $k$ is infinite,
two polynomials define the same function on $k^n$ if and only if they are equal
in $k[x_1, \ldots, x_n]$.
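
The evaluation rule $f(a_1, \ldots, a_n) = \sum_\alpha c_\alpha a^\alpha$ translates directly into code. The following Python sketch is only an illustration: it assumes a polynomial is stored as a dictionary from exponent tuples $\alpha$ to coefficients $c_\alpha$, and the function name `evaluate` is a hypothetical helper. The test polynomial is $p = x^2 + \frac{1}{2} y^2 z - z - 1$ from §3.

```python
from fractions import Fraction
from math import prod

def evaluate(f, point):
    """Value of f = sum c_alpha x^alpha at the point a, i.e. sum c_alpha a^alpha."""
    return sum(c * prod(a**e for a, e in zip(point, alpha)) for alpha, c in f.items())

# p = x^2 + (1/2) y^2 z - z - 1, keyed by (deg x, deg y, deg z)
p = {(2, 0, 0): 1, (0, 2, 1): Fraction(1, 2), (0, 0, 1): -1, (0, 0, 0): -1}

value_on_variety = evaluate(p, (1, 0, 0))     # a point where p vanishes
value_elsewhere = evaluate(p, (1, 2, 3))      # 1 + 6 - 3 - 1
```

Using `Fraction` coefficients keeps the arithmetic exact over $\mathbb{Q}$, so a result of zero means the point really satisfies the equation $p = 0$.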

The simplest geometric objects studied in algebraic geometry are the
subsets of affine space defined by one or more polynomial equations. For
instance, in $\mathbb{R}^3$, consider the set of $(x, y, z)$ satisfying the equation

$x^2 + z^2 - 1 = 0,$

a circular cylinder of radius 1 along the $y$-axis (see Fig. 1.1).

Note that any equation $p = q$, where $p, q \in k[x_1, \ldots, x_n]$, can be rewritten
as $p - q = 0$, so it is customary to write all equations in the form
$f = 0$ and we will always do this. More generally, we could consider the
simultaneous solutions of a system of polynomial equations.




Figure 1.1. Circular Cylinder 
(4.1) Definition. The set of all simultaneous solutions $(a_1, \ldots, a_n) \in k^n$
of a system of equations

$f_1(x_1, \ldots, x_n) = 0$
$f_2(x_1, \ldots, x_n) = 0$
$\vdots$
$f_s(x_1, \ldots, x_n) = 0$

is known as the affine variety defined by $f_1, \ldots, f_s$, and is denoted by
$\mathbf{V}(f_1, \ldots, f_s)$. A subset $V \subset k^n$ is said to be an affine variety if
$V = \mathbf{V}(f_1, \ldots, f_s)$ for some collection of polynomials $f_i \in k[x_1, \ldots, x_n]$.

In later chapters we will also introduce projective varieties. For now,
though, we will often say simply "variety" for "affine variety." For example,
$\mathbf{V}(x^2 + z^2 - 1)$ in $\mathbb{R}^3$ is the cylinder pictured above. The picture was
generated using the Maple command

implicitplot3d(x^2+z^2-1,x=-2..2,y=-2..2,z=-2..2,
grid=[20,20,20]);

The variety $\mathbf{V}(x^2 + y^2 + (z - 1)^2 - 4)$ in $\mathbb{R}^3$ is the sphere of radius 2
centered at $(0, 0, 1)$ (see Fig. 1.2).

Figure 1.2. Sphere

If there is more than one defining equation, the resulting variety can be
considered as an intersection of other varieties. For example, the variety
$\mathbf{V}(x^2 + z^2 - 1,\ x^2 + y^2 + (z - 1)^2 - 4)$ is the curve of intersection of the
cylinder and the sphere pictured above. This is shown, from a viewpoint
below the $xy$-plane, in Fig. 1.3.

Figure 1.3. Cylinder-sphere intersection

The union of the sphere and the cylinder is also a variety, namely
$\mathbf{V}((x^2 + z^2 - 1)(x^2 + y^2 + (z - 1)^2 - 4))$. Generalizing examples like these, we have:

Exercise 1. 

a. Show that any finite intersection of affine varieties is also an affine 
variety. 

b. Show that any finite union of affine varieties is also an affine variety. 
Hint: If $V = \mathbf{V}(f_1, \ldots, f_s)$ and $W = \mathbf{V}(g_1, \ldots, g_t)$, then what is
$\mathbf{V}(f_i g_j : 1 \le i \le s,\ 1 \le j \le t)$?

c. Show that any finite subset of $k^n$, $n \ge 1$, is an affine variety.

On the other hand, consider the set $S = \mathbb{R} \setminus \{0, 1, 2\}$, a subset of $\mathbb{R}$.
We claim $S$ is not an affine variety. Indeed, if $f$ is any polynomial in
$\mathbb{R}[x]$ that vanishes at every point of $S$, then $f$ has infinitely many roots.
By standard properties of polynomials in one variable, this implies that
$f$ must be the zero polynomial. (This is the one-variable case of the Zero
Function property given above; it is easily proved in $k[x]$ using the division
algorithm.) Hence the smallest variety in $\mathbb{R}$ containing $S$ is the whole real
line itself.

An affine variety $V \subset k^n$ can be described by many different systems
of equations. Note that if $g = p_1 f_1 + p_2 f_2 + \cdots + p_s f_s$, where
$p_i \in k[x_1, \ldots, x_n]$ are any polynomials, then $g(a_1, \ldots, a_n) = 0$ at each
$(a_1, \ldots, a_n) \in \mathbf{V}(f_1, \ldots, f_s)$. So given any set of equations defining a
variety, we can always produce infinitely many additional polynomials that
also vanish on the variety. In the language of §1 of this chapter, the $g$ as
above are just the elements of the ideal $\langle f_1, \ldots, f_s \rangle$. Some collections of
these new polynomials can define the same variety as the $f_1, \ldots, f_s$.

Exercise 2. Consider the polynomial $p$ from (1.2). In (1.4) we saw that
$p \in \langle x^2 + z^2 - 1,\ x^2 + y^2 + (z - 1)^2 - 4 \rangle$. Show that

$\langle x^2 + z^2 - 1,\ x^2 + y^2 + (z - 1)^2 - 4 \rangle = \langle x^2 + z^2 - 1,\ y^2 - 2z - 2 \rangle$

in $\mathbb{Q}[x, y, z]$. Deduce that

$\mathbf{V}(x^2 + z^2 - 1,\ x^2 + y^2 + (z - 1)^2 - 4) = \mathbf{V}(x^2 + z^2 - 1,\ y^2 - 2z - 2).$

Generalizing Exercise 2 above, it is easy to see that 

• (Equal Ideals Have Equal Varieties) If $\langle f_1, \ldots, f_s \rangle = \langle g_1, \ldots, g_t \rangle$ in
$k[x_1, \ldots, x_n]$, then $\mathbf{V}(f_1, \ldots, f_s) = \mathbf{V}(g_1, \ldots, g_t)$.

See [CLO], Chapter 1, §4. By this result, together with the Hilbert Basis
Theorem from §1, it also makes sense to think of a variety as being defined
by an ideal in $k[x_1, \ldots, x_n]$, rather than by a specific system of equations.
If we want to think of a variety in this way, we will write $V = \mathbf{V}(I)$ where
$I \subset k[x_1, \ldots, x_n]$ is the ideal under consideration.

Now, given a variety $V \subset k^n$, we can also try to turn the construction of
$V$ from an ideal around, by considering the entire collection of polynomials
that vanish at every point of $V$.

(4.2) Definition. Let $V \subset k^n$ be a variety. We denote by $\mathbf{I}(V)$ the set

$\{f \in k[x_1, \ldots, x_n] : f(a_1, \ldots, a_n) = 0 \text{ for all } (a_1, \ldots, a_n) \in V\}.$

We call $\mathbf{I}(V)$ the ideal of $V$ for the following reason.

Exercise 3. Show that $\mathbf{I}(V)$ is an ideal in $k[x_1, \ldots, x_n]$ by verifying that
the two properties in Definition (1.5) hold.

If $V = \mathbf{V}(I)$, is it always true that $\mathbf{I}(V) = I$? The answer is no, as
the following simple example demonstrates. Consider $V = \mathbf{V}(x^2)$ in $\mathbb{R}^2$.
The ideal $I = \langle x^2 \rangle$ in $\mathbb{R}[x, y]$ consists of all polynomials divisible by $x^2$.
These polynomials are certainly contained in $\mathbf{I}(V)$, since the corresponding
variety $V$ consists of all points of the form $(0, b)$, $b \in \mathbb{R}$ (the $y$-axis). Note
that $p(x, y) = x \in \mathbf{I}(V)$, but $x \notin I$. In this case, $\mathbf{I}(\mathbf{V}(I))$ is strictly larger
than $I$.

Exercise 4. Show that the following inclusions are always valid:

$I \subset \sqrt{I} \subset \mathbf{I}(\mathbf{V}(I)),$

where $\sqrt{I}$ is the radical of $I$ from Definition (1.6).
It is also true that the properties of the field $k$ influence the relation
between $\mathbf{I}(\mathbf{V}(I))$ and $I$. For instance, over $\mathbb{R}$, we have $\mathbf{V}(x^2 + 1) = \emptyset$
and $\mathbf{I}(\mathbf{V}(x^2 + 1)) = \mathbb{R}[x]$. On the other hand, if we take $k = \mathbb{C}$, then
every polynomial in $\mathbb{C}[x]$ factors completely by the Fundamental Theorem
of Algebra. We find that $\mathbf{V}(x^2 + 1)$ consists of the two points $\pm i \in \mathbb{C}$, and
$\mathbf{I}(\mathbf{V}(x^2 + 1)) = \langle x^2 + 1 \rangle$.

Exercise 5. Verify the claims made in the preceding paragraph. You may
want to start out by showing that if $a \in \mathbb{C}$, then $\mathbf{I}(\{a\}) = \langle x - a \rangle$.

The first key relationships between ideals and varieties are summarized 
in the following theorems. 

• (Strong Nullstellensatz) If $k$ is an algebraically closed field (such as $\mathbb{C}$)
and $I$ is an ideal in $k[x_1, \ldots, x_n]$, then

$\mathbf{I}(\mathbf{V}(I)) = \sqrt{I}.$

• (Ideal-Variety Correspondence) Let $k$ be an arbitrary field. The maps

affine varieties $\xrightarrow{\ \mathbf{I}\ }$ ideals

and

ideals $\xrightarrow{\ \mathbf{V}\ }$ affine varieties

are inclusion-reversing, and $\mathbf{V}(\mathbf{I}(V)) = V$ for all affine varieties $V$. If $k$
is algebraically closed, then

affine varieties $\xrightarrow{\ \mathbf{I}\ }$ radical ideals

and

radical ideals $\xrightarrow{\ \mathbf{V}\ }$ affine varieties

are inclusion-reversing bijections, and inverses of each other.

See, for instance [CLO], Chapter 4, §2, or [AL], Chapter 2, §2. We con- 
sider how the operations on ideals introduced in §1 relate to operations on 
varieties in the following exercises. 



Additional Exercises for §4 

Exercise 6. In §1, we saw that the polynomial $p = x^2 + \frac{1}{2} y^2 z - z - 1$ is
in the ideal $I = \langle x^2 + z^2 - 1,\ x^2 + y^2 + (z - 1)^2 - 4 \rangle \subset \mathbb{R}[x, y, z]$.

a. What does this fact imply about the varieties $\mathbf{V}(p)$ and $\mathbf{V}(I)$ in $\mathbb{R}^3$?
($\mathbf{V}(I)$ is the curve of intersection of the cylinder and the sphere pictured
in the text.)

b. Using a 3-dimensional graphing program (e.g. Maple's implicitplot3d
function from the plots package) or otherwise, generate a picture of the
variety $\mathbf{V}(p)$.

c. Show that $\mathbf{V}(p)$ contains the variety $W = \mathbf{V}(x^2 - 1,\ y^2 - 2)$. Describe
$W$ geometrically.

d. If we solve the equation

$x^2 + \frac{1}{2} y^2 z - z - 1 = 0$

for $z$, we obtain

(4.3) $z = \frac{x^2 - 1}{1 - \frac{1}{2} y^2}.$

The right-hand side $r(x, y)$ of (4.3) is a quotient of polynomials or, in the
terminology of §1, a rational function in $x, y$, and (4.3) is the equation
of the graph of $r(x, y)$. Exactly how does this graph relate to the variety
$\mathbf{V}(x^2 + \frac{1}{2} y^2 z - z - 1)$ in $\mathbb{R}^3$? (Are they the same? Is one a subset of
the other? What is the domain of $r(x, y)$ as a function from $\mathbb{R}^2$ to $\mathbb{R}$?)



Exercise 7. Show that for any ideal $I \subset k[x_1, \ldots, x_n]$, $\sqrt{\sqrt{I}} = \sqrt{I}$. Hence
$\sqrt{I}$ is automatically a radical ideal.



Exercise 8. Assume $k$ is an algebraically closed field. Show that in
the Ideal-Variety Correspondence, sums of ideals (see Exercise 11 of §1)
correspond to intersections of the corresponding varieties:

$\mathbf{V}(I + J) = \mathbf{V}(I) \cap \mathbf{V}(J).$

Also show that if $V$ and $W$ are any varieties,

$\mathbf{I}(V \cap W) = \sqrt{\mathbf{I}(V) + \mathbf{I}(W)}.$



Exercise 9. 

a. Show that the intersection of two radical ideals is also a radical ideal. 

b. Show that in the Ideal-Variety Correspondence above, intersections
of ideals (see Exercise 12 from §1) correspond to unions of the
corresponding varieties:

$\mathbf{V}(I \cap J) = \mathbf{V}(I) \cup \mathbf{V}(J).$

Also show that if $V$ and $W$ are any varieties,

$\mathbf{I}(V \cup W) = \mathbf{I}(V) \cap \mathbf{I}(W).$

c. Show that products of ideals (see Exercise 12 from §1) also correspond
to unions of varieties:

$\mathbf{V}(IJ) = \mathbf{V}(I) \cup \mathbf{V}(J).$

Assuming $k$ is algebraically closed, how is the product $\mathbf{I}(V)\mathbf{I}(W)$ related
to $\mathbf{I}(V \cup W)$?

Exercise 10. A variety $V$ is said to be irreducible if in every expression
of $V$ as a union of other varieties, $V = V_1 \cup V_2$, either $V_1 = V$ or $V_2 = V$.
Show that an affine variety $V$ is irreducible if and only if $\mathbf{I}(V)$ is a prime
ideal (see Exercise 8 from §1).

Exercise 11. 

a. Show by example that the set difference of two affine varieties:

$V \setminus W = \{p \in V : p \notin W\}$

need not be an affine variety. Hint: For instance, let $k$ be an infinite
field, consider $k[x]$, and let $V = k = \mathbf{V}(0)$ and $W = \{0\} = \mathbf{V}(x)$.

b. Show that for any ideals $I, J$ in $k[x_1, \ldots, x_n]$, $\mathbf{V}(I : J)$ contains
$\mathbf{V}(I) \setminus \mathbf{V}(J)$, but that we may not have equality. (Here $I : J$ is the
quotient ideal introduced in Exercise 13 from §1.)

c. If $I$ is a radical ideal, show that any variety containing
$\mathbf{V}(I) \setminus \mathbf{V}(J)$ must contain $\mathbf{V}(I : J)$. Thus $\mathbf{V}(I : J)$ is the smallest variety
containing the difference $\mathbf{V}(I) \setminus \mathbf{V}(J)$; it is called the Zariski closure of
$\mathbf{V}(I) \setminus \mathbf{V}(J)$. See [CLO], Chapter 4, §4.

d. Show that if $I$ is a radical ideal and $J$ is any ideal, then $I : J$ is also a
radical ideal. Deduce that $\mathbf{I}(V) : \mathbf{I}(W)$ is the radical ideal corresponding
to the Zariski closure of $V \setminus W$ in the Ideal-Variety Correspondence.




Chapter 2 

Solving Polynomial Equations 



In this chapter we will discuss several approaches to solving systems of 
polynomial equations. First, we will discuss a straightforward attack based 
on the elimination properties of lexicographic Grobner bases. Combining 
elimination with numerical root-finding for one-variable polynomials we get
a conceptually simple method that generalizes the usual techniques used 
to solve systems of linear equations. However, there are potentially severe 
difficulties when this approach is implemented on a computer using finite- 
precision arithmetic. To circumvent these problems, we will develop some 
additional algebraic tools for root-finding based on the algebraic structure 
of the quotient rings $k[x_1, \ldots, x_n]/I$. Using these tools, we will present
alternative numerical methods for approximating solutions of polynomial 
systems and consider methods for real root-counting and root-isolation. 
In Chapters 3, 4 and 7, we will also discuss polynomial equation solving. 
Specifically, Chapter 3 will use resultants to solve polynomial equations, 
and Chapter 4 will show how to assign a well-behaved multiplicity to each 
solution of a system. Chapter 7 will consider other numerical techniques 
(homotopy continuation methods) based on bounds for the total number 
of solutions of a system, counting multiplicities. 



§1 Solving Polynomial Systems by Elimination 

The main tools we need are the Elimination and Extension Theorems. For 
the convenience of the reader, we recall the key ideas: 

• (Elimination Ideals) If $I$ is an ideal in $k[x_1, \ldots, x_n]$, then the $\ell$th
elimination ideal is

$I_\ell = I \cap k[x_{\ell+1}, \ldots, x_n].$

Intuitively, if $I = \langle f_1, \ldots, f_s \rangle$, then the elements of $I_\ell$ are the linear
combinations of the $f_1, \ldots, f_s$, with polynomial coefficients, that eliminate
$x_1, \ldots, x_\ell$ from the equations $f_1 = \cdots = f_s = 0$.
• (The Elimination Theorem) If $G$ is a Gröbner basis for $I$ with respect
to the lex order ($x_1 > x_2 > \cdots > x_n$) (or any order where monomials
involving at least one of $x_1, \ldots, x_\ell$ are greater than all monomials
involving only the remaining variables), then

$G_\ell = G \cap k[x_{\ell+1}, \ldots, x_n]$

is a Gröbner basis of the $\ell$th elimination ideal $I_\ell$.

• (Partial Solutions) A point $(a_{\ell+1}, \ldots, a_n) \in \mathbf{V}(I_\ell) \subset k^{n-\ell}$ is called a
partial solution. Any solution $(a_1, \ldots, a_n) \in \mathbf{V}(I) \subset k^n$ truncates to
a partial solution, but the converse may fail: not all partial solutions
extend to solutions. This is where the Extension Theorem comes in. To
prepare for the statement, note that each $f$ in $I_{\ell-1}$ can be written as a
polynomial in $x_\ell$, whose coefficients are polynomials in $x_{\ell+1}, \ldots, x_n$:

$f = c_q(x_{\ell+1}, \ldots, x_n)\, x_\ell^q + \cdots + c_0(x_{\ell+1}, \ldots, x_n).$

We call $c_q$ the leading coefficient polynomial of $f$ if $x_\ell^q$ is the highest
power of $x_\ell$ appearing in $f$.

• (The Extension Theorem) If $k$ is algebraically closed (e.g., $k = \mathbb{C}$), then
a partial solution $(a_{\ell+1}, \ldots, a_n)$ in $\mathbf{V}(I_\ell)$ extends to $(a_\ell, a_{\ell+1}, \ldots, a_n)$ in
$\mathbf{V}(I_{\ell-1})$ provided that the leading coefficient polynomials of the elements
of a lex Gröbner basis for $I_{\ell-1}$ do not all vanish at $(a_{\ell+1}, \ldots, a_n)$.

For the proofs of these results and a discussion of their geometric meaning, 
see Chapter 3 of [CLO]. Also, the Elimination Theorem is discussed in §6.2 
of [BW] and §2.3 of [AL], and [AL] discusses the geometry of elimination 
in §2.5. 

The Elimination Theorem shows that a lex Grobner basis G successively 
eliminates more and more variables. This gives the following strategy for 
finding all solutions of the system: start with the polynomials in G with the 
fewest variables, solve them, and then try to extend these partial solutions 
to solutions of the whole system, applying the Extension Theorem one 
variable at a time. 

As the following example shows, this works especially nicely when $\mathbf{V}(I)$
is finite. Consider the system of equations

$x^2 + y^2 + z^2 = 4$
(1.1) $x^2 + 2y^2 = 5$
$xz = 1$

from Exercise 4 of Chapter 3, §1 of [CLO]. To solve these equations, we
first compute a lex Gröbner basis for the ideal they generate using Maple:

with(grobner):
PList := [x^2+y^2+z^2-4, x^2+2*y^2-5, x*z-1];
VList := [x,y,z];
G := gbasis(PList,VList,plex);
This gives output

$G := [2z^3 - 3z + x,\ -1 + y^2 - z^2,\ 1 + 2z^4 - 3z^2].$

From the Gröbner basis it follows that the set of solutions of this system in
$\mathbb{C}^3$ is finite (why?). To find all the solutions, note that the last polynomial
depends only on $z$ (it is a generator of the second elimination ideal
$I_2 = I \cap \mathbb{C}[z]$) and factors nicely in $\mathbb{Q}[z]$. To see this, we may use

factor(2*z^4 - 3*z^2 + 1);

which generates the output

$(z - 1)(z + 1)(2z^2 - 1).$

Thus we have four possible $z$ values to consider:

$z = \pm 1,\ \pm 1/\sqrt{2}.$

By the Elimination Theorem, the first elimination ideal $I_1 = I \cap \mathbb{C}[y, z]$ is
generated by

$y^2 - z^2 - 1$
$2z^4 - 3z^2 + 1.$

Since the coefficient of $y^2$ in the first polynomial is a nonzero constant,
every partial solution in $\mathbf{V}(I_2)$ extends to a solution in $\mathbf{V}(I_1)$. There are
eight such points in all. To find them, we substitute a root of the last
equation for $z$ and solve the resulting equation for $y$. For instance,

subs(z=1,G);

will produce:

$[-1 + x,\ y^2 - 2,\ 0],$

so in particular, $y = \pm\sqrt{2}$. In addition, since the coefficient of $x$ in the first
polynomial in the Gröbner basis is a nonzero constant, we can extend each
partial solution in $\mathbf{V}(I_1)$ (uniquely) to a point of $\mathbf{V}(I)$. For this value of $z$,
we have $x = 1$.

Exercise 1. Carry out the same process for the other values of $z$ as well.
You should find that the eight points

$(1, \pm\sqrt{2}, 1),\ (-1, \pm\sqrt{2}, -1),\ (\sqrt{2}, \pm\sqrt{6}/2, 1/\sqrt{2}),\ (-\sqrt{2}, \pm\sqrt{6}/2, -1/\sqrt{2})$

form the set of solutions.
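
The back-substitution described above is easy to check in floating point. The following Python sketch is an illustration only: it takes the Gröbner basis $G$ as printed, solves its polynomials for $x$ and $y$ at each of the four $z$ values, and evaluates the original system (1.1). The names `solutions` and `residuals` are hypothetical.

```python
from math import sqrt

# Roots of the last basis polynomial 2z^4 - 3z^2 + 1 = (z - 1)(z + 1)(2z^2 - 1).
z_values = [1.0, -1.0, 1 / sqrt(2), -1 / sqrt(2)]

solutions = []
for z in z_values:
    x = 3 * z - 2 * z**3                 # from 2z^3 - 3z + x = 0
    for sign in (1, -1):
        y = sign * sqrt(1 + z**2)        # from -1 + y^2 - z^2 = 0
        solutions.append((x, y, z))

def residuals(x, y, z):
    """Left side minus right side for each equation of (1.1)."""
    return (x**2 + y**2 + z**2 - 4, x**2 + 2 * y**2 - 5, x * z - 1)
```

Each $z$ value extends to exactly two points because the leading coefficients of $y^2$ and $x$ in the basis are nonzero constants, which is the situation covered by the Extension Theorem.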

The system in (1.1) is relatively simple because the coordinates of the 
solutions can all be expressed in terms of square roots of rational numbers. 
Unfortunately, general systems of polynomial equations are rarely this nice. 
For instance it is known that there are no general formulas involving only 
the field operations in $k$ and extraction of roots (i.e., radicals) for solving
single variable polynomial equations of degree 5 and higher. This is a
famous result of Ruffini, Abel, and Galois (see [Her]). Thus, if elimination
leads to a one-variable equation of degree 5 or higher, then we may not be
able to give radical formulas for the roots of that polynomial.

We take the system of equations given in (1.1) and change the first term
in the first polynomial from $x^2$ to $x^5$. Then executing

PList2 := [x^5+y^2+z^2-4, x^2+2*y^2-5, x*z-1];
VList2 := [x,y,z];
G2 := gbasis(PList2,VList2,plex);

produces the following lex Gröbner basis:

(1.2) $G_2 := [2x + 2z^6 - z^2 - 3z^4,\ -10 + z + 3z^3 - 2z^5 + 4y^2,\ 2 - z^3 - 3z^5 + 2z^7].$

In this case, the command 

factor(2*z^7 - 3*z^5 - z^3 + 2);

gives the factorization

$2z^7 - 3z^5 - z^3 + 2 = (z - 1)(2z^6 + 2z^5 - z^4 - z^3 - 2z^2 - 2z - 2),$

and the second factor is irreducible in Q[z]. In a situation like this, to 
go farther in equation solving, we need to decide what kind of answer is 
required. 

If we want a purely algebraic, “structural” description of the solutions, 
then Maple can represent solutions of systems like this via the solve 
command. Let’s see what this looks like. Entering 

solve(convert(G2,set),{x,y,z});

you should generate the following output:

{z = 1, y = RootOf(-2 + _Z^2), x = 1},

{z = %1, y = 1/2 RootOf(-10 + %1 + 3 %1^3 - 2 %1^5 + _Z^2),
x = -1/2 %1^2 (2 %1^4 - 1 - 3 %1^2)}

%1 := RootOf(2 _Z^6 + 2 _Z^5 - _Z^4 - _Z^3 - 2 _Z^2 - 2 _Z - 2)

Here the %1 is an abbreviation for a subexpression that occurs several
times. It stands for any one root of the polynomial
2 _Z^6 + 2 _Z^5 - _Z^4 - _Z^3 - 2 _Z^2 - 2 _Z - 2. Similarly, the other RootOf
expressions that appear in the solutions stand for any solution of the
corresponding equation in the dummy variable _Z.

Exercise 2. Verify that the expressions above are obtained if we solve for
$z$ from the Gröbner basis $G_2$ and then use the Extension Theorem. How
many solutions are there of this system in $\mathbb{C}^3$?
On the other hand, in many practical situations where equations must
be solved, knowing a numerical approximation to a real or complex solution
is often more useful, and perfectly acceptable provided the results are
sufficiently accurate. In our particular case, one possible approach would
be to use a numerical root-finding method to find approximate solutions of
the one-variable equation

(1.3) $2z^6 + 2z^5 - z^4 - z^3 - 2z^2 - 2z - 2 = 0,$

and then proceed as before using the Extension Theorem, except that we
now use floating point arithmetic in all calculations. In some examples,
numerical methods will also be needed to solve for the other variables as
we extend.

One well-known numerical method for solving one-variable polynomial
equations in $\mathbb{R}$ or $\mathbb{C}$ is the Newton-Raphson method or, more simply but
less accurately, Newton's method. This method may also be used for
equations involving functions other than polynomials, although we will not
discuss those here. For motivation and a discussion of the theory behind
the method, see [BuF] or [Act].

The Newton-Raphson method works as follows. Choosing some initial
approximation $z_0$ to a root of $p(z) = 0$, we construct a sequence of numbers
by the rule

$z_{k+1} = z_k - \frac{p(z_k)}{p'(z_k)}$ for $k = 0, 1, 2, \ldots,$

where $p'(z)$ is the usual derivative of $p$ from calculus. In most situations,
the sequence $z_k$ will converge rapidly to a solution $\bar{z}$ of $p(z) = 0$, that is,
$\bar{z} = \lim_{k \to \infty} z_k$ will be a root. Stopping this procedure after a finite number
of steps (as we must!), we obtain an approximation to $\bar{z}$. For example we
might stop when $z_{k+1}$ and $z_k$ agree to some desired accuracy, or when a
maximum allowed number of terms of the sequence have been computed.
See [BuF], [Act], or the comments at the end of this section for additional
information on the performance of this technique. When trying to find all
roots of a polynomial, the trickiest part of the Newton-Raphson method is
making appropriate choices of $z_0$. It is easy to find the same root repeatedly
and to miss other ones if you don't know where to look!
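
The iteration just described can be sketched in Python as follows. This is a minimal illustration, not a robust root finder: the function names `horner` and `newton`, the tolerance and the iteration cap are assumptions made for the example, and the starting values are chosen by hand near the two real roots of (1.3).

```python
def horner(coeffs, z):
    """Return (p(z), p'(z)) for coefficients listed from highest degree down."""
    p, dp = 0.0, 0.0
    for c in coeffs:
        dp = dp * z + p
        p = p * z + c
    return p, dp

def newton(coeffs, z0, tol=1e-12, max_iter=50):
    """Newton-Raphson: z_{k+1} = z_k - p(z_k) / p'(z_k), stopped when steps are small."""
    z = z0
    for _ in range(max_iter):
        p, dp = horner(coeffs, z)
        if dp == 0:
            raise ZeroDivisionError("derivative vanished; try another starting value")
        z_next = z - p / dp
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("no convergence within max_iter steps")

# Coefficients of (1.3): 2z^6 + 2z^5 - z^4 - z^3 - 2z^2 - 2z - 2
q = [2, 2, -1, -1, -2, -2, -2]
root_pos = newton(q, 1.2)
root_neg = newton(q, -1.4)
```

Because `horner` only uses addition and multiplication, the same code accepts a complex starting value such as `1j`, which is one way to look for the non-real roots.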

Fortunately, there are elementary bounds on the absolute values of the 
roots (real or complex) of a polynomial p{z). Here is one of the simpler 
bounds. 

Exercise 3. Show that if $p(z) = z^n + a_{n-1} z^{n-1} + \cdots + a_0$ is a monic
polynomial with complex coefficients, then all roots $z$ of $p$ satisfy $|z| \le B$,
where

$B = \max\{1,\ |a_{n-1}| + \cdots + |a_1| + |a_0|\}.$

Hint: The triangle inequality implies that $|a + b| \ge |a| - |b|$.
See Exercise 10 below for a better bound on the roots. Given any
bound of this sort, we can limit our attention to $z_0$ in this region of the
complex plane to search for roots of the polynomial.
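
As a small illustration, the bound of Exercise 3 is easy to compute. The sketch below is in Python; the function name root_bound and the test polynomial $(z - 1)(z - 2)(z - 3)$ are assumptions made for the example.

```python
def root_bound(a):
    """B = max{1, |a_{n-1}| + ... + |a_0|} for a monic polynomial.

    a lists the non-leading coefficients a_{n-1}, ..., a_0.
    """
    return max(1, sum(abs(c) for c in a))


# (z - 1)(z - 2)(z - 3) = z^3 - 6z^2 + 11z - 6, a hypothetical test case
B = root_bound([-6, 11, -6])
roots_inside = all(abs(r) <= B for r in (1, 2, 3))
```

Here B = 23, so starting values for Newton-Raphson could be restricted to the disk of radius 23. The bound is far from sharp in this case, since the largest root is 3.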

Instead of discussing searching strategies for finding roots, we will use a
built-in Maple function to approximate the roots of the system from (1.2).
The Maple function fsolve finds numerical approximations to all real (or
complex) roots of a polynomial by a combination of root location and
numerical techniques like Newton-Raphson. For instance, the command

fsolve(2*z^6+2*z^5-z^4-z^3-2*z^2-2*z-2);

will compute approximate values for the real roots of our polynomial (1.3).
The output should be:

-1.395052015, 1.204042437.

(Note: In Maple, 10 digits are carried by default in decimal calculations;
more digits can be used by changing the value of the Maple system variable
Digits. Also, the actual digits in your output may vary slightly if you
carry out this computation using another computer algebra system.) To
get approximate values for the complex roots as well, try:

fsolve(2*z^6+2*z^5-z^4-z^3-2*z^2-2*z-2,complex);

We illustrate the Extension Step in this case using the approximate value

z = 1.204042437.

We substitute this value into the Grobner basis polynomials using

subs(z=1.204042437,G2);

and obtain

$[2x - 1.661071025,\ -8.620421528 + 4y^2,\ -.2 \times 10^{-8}]$.

Note that the value of the last polynomial was not exactly zero at our
approximate value of z. Nevertheless, as in Exercise 1, we can extend this
approximate partial solution to two approximate solutions of the system:

$(x, y, z) = (.8305355125,\ \pm 1.468027718,\ 1.204042437)$.

Checking one of these by substituting into the equations from (1.2), using

subs(z=1.204042437,y=1.468027718,x=.8305355125,G2);

we find

$[0,\ -.4 \times 10^{-8},\ -.2 \times 10^{-8}]$,

so we have a reasonably good approximate solution, in the sense that our
computed solution gives values very close to zero in the polynomials of the
system.
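
The arithmetic of this extension step can be checked directly. The Python lines below are only a sketch: they read the first two entries of the substituted basis as $2x - 1.661071025$ and $-8.620421528 + 4y^2$ and solve each for its variable.

```python
import math

x = 1.661071025 / 2               # from 2x - 1.661071025 = 0
y = math.sqrt(8.620421528 / 4)    # from -8.620421528 + 4y^2 = 0; -y also works
residual = abs(4 * y ** 2 - 8.620421528)
```

The values of x and y agree with the approximate solution quoted above to the digits shown there.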

Exercise 4. Find approximate values for all other real solutions of this
system by the same method.

In considering what we did here, one potential pitfall of this approach 
should be apparent. Namely, since our solutions of the one-variable equation
are only approximate, when we substitute and try to extend, the remaining 
polynomials to be solved for x and y are themselves only approximate. Once 
we substitute approximate values for one of the variables, we are in effect 
solving a system of equations that is different from the one we started 
with, and there is little guarantee that the solutions of this new system are 
close to the solutions of the original one. Accumulated errors after several 
approximation and extension steps can build up quite rapidly in systems 
in larger numbers of variables, and the effect can be particularly severe if 
equations of high degree are present. 

To illustrate how bad things can get, we consider a famous cautionary 
example due to Wilkinson, which shows how much the roots of a polynomial 
can be changed by very small changes in the coefficients. 

Wilkinson’s example involves the following polynomial of degree 20:

$$p(x) = (x + 1)(x + 2) \cdots (x + 20) = x^{20} + 210x^{19} + \cdots + 20!.$$

The roots are the 20 integers x = -1, -2, ..., -20. Suppose now that we
“perturb” just the coefficient of $x^{19}$, adding a very small number. We carry
20 decimal digits in all calculations. First we construct p(x) itself:

Digits := 20:
p := 1:
for k to 20 do p := p*(x+k) od:

Printing expand(p) out at this point will show a polynomial with some
large coefficients indeed! But the polynomial we want is actually this:

q := expand(p + .000000001*x^19):
fsolve(q,x,complex);

The approximate roots of $q = p + .000000001\,x^{19}$ (truncated for simplicity)
are:

-20.03899, -18.66983 - .35064 I, -18.66983 + .35064 I,
-16.57173 - .88331 I, -16.57173 + .88331 I,
-14.37367 - .77316 I, -14.37367 + .77316 I,
-12.38349 - .10866 I, -12.38349 + .10866 I,
-10.95660, -10.00771, -8.99916, -8.00005,
-6.999997, -6.000000, -4.99999, -4.00000,
-2.999999, -2.000000, -1.00000.

Instead of 20 real roots, the new polynomial has 12 real roots and 4 complex
conjugate pairs of roots. Note that the imaginary parts are not even
especially small!

While this example is admittedly pathological, it indicates that we should
use care in finding roots of polynomials whose coefficients are only
approximately determined. (The reason for the surprisingly bad behavior of this p
is essentially the equal spacing of the roots! We refer the interested reader
to Wilkinson’s paper [Wil] for a full discussion.)
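
One way to see the sensitivity without a root finder is a first-order estimate. This is a standard perturbation argument, not taken from the text: if $\delta x^{19}$ is added to p, a simple root r moves by about $-\delta\, r^{19}/p'(r)$. The Python sketch below evaluates this for each root $r = -m$; the function name first_order_shift is an assumption, and the estimate is only meaningful while the predicted shift is small compared with the spacing of the roots.

```python
from math import prod


def first_order_shift(m, delta=1e-9, n=20):
    """Estimate the change in the root x = -m of p(x) = (x+1)...(x+n)
    when delta * x^(n-1) is added to p:  dx ~ -delta * x^(n-1) / p'(x)."""
    x = -m
    # p'(-m) is the product of (j - m) over all j != m, an exact integer
    dp = prod(j - m for j in range(1, n + 1) if j != m)
    return -delta * x ** (n - 1) / dp


shifts = {m: first_order_shift(m) for m in range(1, 21)}
```

For the root -20 the estimate is about -0.043, in line with the computed root -20.03899. For the roots near -15 the predicted shift exceeds the spacing between roots, which signals that those roots can collide and leave the real axis, while the root -1 is predicted to move by less than $10^{-20}$.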

Along the same lines, even if nothing this spectacularly bad happens, 
when we take the approximate roots of a one variable polynomial and try 
to extend to solutions of a system, the results of a numerical calculation can 
still be unreliable. Here is a simple example illustrating another situation 
that causes special problems. 

Exercise 5. Verify that if x > y, then

$$G = [x^2 + 2x + 3 + y^5 - y,\ y^6 - y^2 + 2y]$$

is a lex Grobner basis for the ideal that G generates in R[x, y].

We want to find all real points $(x, y) \in \mathbf{V}(G)$. Begin with the equation

$$y^6 - y^2 + 2y = 0,$$

which has exactly two real roots. One is y = 0, and the second is in the
interval [-2, -1] because the polynomial changes sign on that interval.
Hence there must be a root there by the Intermediate Value Theorem from
calculus. Using fsolve to find an approximate value, we find the nonzero
root is

(1.4) -1.267168305

to 10 decimal digits. Substituting this approximate value for y into G yields

$[x^2 + 2x + .999999995,\ .7 \times 10^{-8}]$.

Then use 

fsolve(x^2 + 2*x + .999999995);

to obtain 

-1.000070711, -.9999292893. 

Clearly these are both close to x = —1, but they are different. Taken 
uncritically, this would seem to indicate two distinct real values of x when 
y is given by (1.4). 

Now, suppose we used an approximate value for y with fewer decimal
digits, say y = -1.2671683. Substituting this value for y gives us the
quadratic

$x^2 + 2x + 1.000000054.$

This polynomial has no real roots at all. Indeed, using the complex option
in fsolve, we obtain two complex values for x:

-1. - .0002323790008 I, -1. + .0002323790008 I.

To see what is really happening, note that the nonzero real root of
$y^6 - y^2 + 2y = 0$ satisfies $y^5 - y + 2 = 0$. When the exact root is substituted
into G, we get

$[x^2 + 2x + 1,\ 0]$

and the resulting equation has a double root x = -1.

The conclusion to be drawn from this example is that equations with
double roots, such as the exact equation

$x^2 + 2x + 1 = 0$

we got above, are especially vulnerable to the errors introduced by
numerical root-finding. It can be very difficult to tell the difference between a
pair of real roots that are close, a real double root, and a pair of complex
conjugate roots.
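
The three cases can be reproduced with the quadratic formula. The following Python sketch (the function name quadratic_roots is an assumption) solves $x^2 + 2x + c = 0$ for the two approximate constant terms found above and for the exact value c = 1.

```python
import cmath


def quadratic_roots(b, c):
    """Roots of x^2 + b*x + c by the quadratic formula, in complex arithmetic."""
    s = cmath.sqrt(b * b - 4 * c)
    return (-b - s) / 2, (-b + s) / 2


close_real = quadratic_roots(2.0, 0.999999995)     # two nearby real roots
double = quadratic_roots(2.0, 1.0)                 # exact double root -1
complex_pair = quadratic_roots(2.0, 1.000000054)   # complex conjugate pair
```

A change of about $6 \times 10^{-8}$ in the constant term moves the roots a distance of order $10^{-4}$ and changes their type. The displacement behaves like the square root of the perturbation, which is why double roots are so delicate.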

From these examples, it should be clear that finding solutions of polynomial
systems is a delicate task in general, especially if we ask for information
about how many real solutions there are. For this reason, numerical methods,
for all their undeniable usefulness, are not the whole story. And they
should never be applied blindly. The more information we have about the
structure of the set of solutions of a polynomial system, the better a chance
we have to determine those solutions accurately. For this reason, in §2 and
§3 we will go to the algebraic setting of the quotient ring $k[x_1, \ldots, x_n]/I$
to obtain some additional tools for this problem. We will apply those tools
in §4 and §5 to give better methods for finding solutions.

For completeness, we conclude with a few additional words about the
numerical methods for equation solving that we have used. First, if $\bar{z}$ is a
multiple root of $p(z) = 0$, then the convergence of the Newton-Raphson
sequence $z_k$ can be quite slow, and a large number of steps and high precision
may be required to get really close to a root (though we give a method for
avoiding this difficulty in Exercise 8). Second, there are some choices of $z_0$
where the sequence $z_k$ will fail to converge to a root of $p(z)$. See Exercise
9 below for some simple examples. Finally, the location of $\bar{z}$ in relation to
$z_0$ can be somewhat unpredictable. There could be other roots lying closer
to $z_0$. These last two problems are related to the fractal pictures associated
to the Newton-Raphson method over C; see, for example, [PR]. We should
also mention that there are multivariable versions of Newton-Raphson for
systems of equations and other iterative methods that do not depend on
elimination. These have been much studied in numerical analysis. For more
details on these and other numerical root-finding methods, see [BuF] and
[Act]. Also, we will discuss homotopy continuation methods in Chapter 7,
§5 of this book.

Additional Exercises for §1

Exercise 6. Use elimination to solve the system 

0 = x^ + 2y^ -y-2z 
0 = + lOz - 1 

0 = — 7yz. 

How many solutions are there in $R^3$; how many are there in $C^3$?

Exercise 7. Use elimination to solve the system 

0 = — 2x 

0 = x^ — yz — X 
0 = X - y 2z. 

How many solutions are there in $R^3$; how many are there in $C^3$?

Exercise 8. In this exercise we will study exactly why the performance
of the Newton-Raphson method is poor for multiple roots, and suggest a
remedy. Newton-Raphson iteration for any equation $p(z) = 0$ is an example
of fixed point iteration, in which a starting value $z_0$ is chosen and a sequence

(1.5) $z_{k+1} = g(z_k)$ for k = 0, 1, 2, ...

is constructed by iteration of a fixed function $g(z)$. For Newton-Raphson
iteration, the function $g(z)$ is $g(z) = N_p(z) = z - p(z)/p'(z)$. If the
sequence produced by (1.5) converges to some limit $\bar{z}$, then $\bar{z}$ is a fixed point
of g (that is, a solution of $g(z) = z$). It is a standard result from analysis
(a special case of the Contraction Mapping Theorem) that iteration as in
(1.5) will converge to a fixed point $\bar{z}$ of g provided that $|g'(\bar{z})| < 1$, and $z_0$
is chosen sufficiently close to $\bar{z}$. Moreover, the smaller $|g'(\bar{z})|$ is, the faster
convergence will be. The case $g'(\bar{z}) = 0$ is especially favorable.

a. Show that each simple root of the polynomial equation $p(z) = 0$ is a
fixed point of the rational function $N_p(z) = z - p(z)/p'(z)$.

b. Show that multiple roots of $p(z) = 0$ are removable singularities of
$N_p(z)$ (that is, $|N_p(z)|$ is bounded in a neighborhood of each multiple
root). How should $N_p$ be defined at a multiple root of $p(z) = 0$ to make
$N_p$ continuous at those points?

c. Show that $N_p'(z) = 0$ if z is a simple root of $p(z) = 0$ (that is, if
$p(z) = 0$, but $p'(z) \ne 0$).

d. On the other hand, show that if $\bar{z}$ is a root of multiplicity k of $p(z)$ (that
is, if $p(\bar{z}) = p'(\bar{z}) = \cdots = p^{(k-1)}(\bar{z}) = 0$ but $p^{(k)}(\bar{z}) \ne 0$), then

$$\lim_{z \to \bar{z}} N_p'(z) = 1 - \frac{1}{k}.$$

Thus Newton-Raphson iteration converges much faster to a simple
root of $p(z) = 0$ than it does to a multiple root, and the larger the
multiplicity, the slower the convergence.

e. Show that replacing $p(z)$ by

$$p_{red}(z) = \frac{p(z)}{\mathrm{GCD}(p(z), p'(z))}$$

(see [CLO], Chapter 1, §5, Exercises 14 and 15) eliminates this difficulty,
in the sense that the roots of $p_{red}(z) = 0$ are all simple roots.

Exercise 9. There are cases when the Newton-Raphson method fails to
find a root of a polynomial for lots of starting points $z_0$.

a. What happens if the Newton-Raphson method is applied to solve the
equation $z^2 + 1 = 0$ starting from a real $z_0$? What happens if you take
$z_0$ with nonzero imaginary parts? Note: It can be shown that Newton-Raphson
iteration for the equation $p(z) = 0$ is chaotic if $z_0$ is chosen in
the Julia set of the rational function $N_p(z) = z - p(z)/p'(z)$ (see [PR]),
and exact arithmetic is employed.

b. Let $p(z) = z^4 - z^2 - 11/36$ and, as above, let $N_p(z) = z - p(z)/p'(z)$.
Show that $\pm 1/\sqrt{6}$ satisfies $N_p(1/\sqrt{6}) = -1/\sqrt{6}$, $N_p(-1/\sqrt{6}) = 1/\sqrt{6}$,
and $N_p'(1/\sqrt{6}) = 0$. In the language of dynamical systems, $\pm 1/\sqrt{6}$ is
a superattracting 2-cycle for $N_p(z)$. One consequence is that for any $z_0$
close to $\pm 1/\sqrt{6}$, the Newton-Raphson method will not locate a root of
p. This example is taken from Chapter 13 of [Dev].

Exercise 10. This exercise improves the bound on roots of a polynomial
given in Exercise 3. Let $p(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$ be a monic
polynomial in C[z]. Show that all roots z of p satisfy $|z| < B$, where

$$B = 1 + \max\{|a_{n-1}|, \ldots, |a_1|, |a_0|\}.$$

This upper bound can be much smaller than the one given in Exercise 3.
Hint: Use the Hint from Exercise 3, and consider the evaluation of $p(z)$ by
nested multiplication:

$$p(z) = (\cdots((z + a_{n-1})z + a_{n-2})z + \cdots + a_1)z + a_0.$$



§2 Finite-Dimensional Algebras 

This section will explore the “remainder arithmetic” associated to a
Grobner basis $G = \{g_1, \ldots, g_t\}$ of an ideal $I \subset k[x_1, \ldots, x_n]$. Recall from
Chapter 1 that if we divide $f \in k[x_1, \ldots, x_n]$ by G, the division algorithm
yields an expression

(2.1) $f = h_1 g_1 + \cdots + h_t g_t + \overline{f}^G$,

where the remainder $\overline{f}^G$ is a linear combination of the monomials
$x^\alpha \notin \langle \mathrm{LT}(I) \rangle$. Furthermore, since G is a Grobner basis, we know that $f \in I$ if
and only if $\overline{f}^G = 0$, and the remainder is uniquely determined for all f.
This implies

(2.2) $\overline{f}^G = \overline{g}^G \iff f - g \in I$.

Since polynomials can be added and multiplied, given $f, g \in k[x_1, \ldots, x_n]$
it is natural to ask how the remainders of $f + g$ and $fg$ can be computed
if we know the remainders of f, g themselves. The following observations
show how this can be done.

• The sum of two remainders is again a remainder, and in fact one can
easily show that $\overline{f}^G + \overline{g}^G = \overline{f + g}^G$.

• On the other hand, the product of remainders need not be a remainder.
But it is also easy to see that $\overline{\overline{f}^G \cdot \overline{g}^G}^G = \overline{fg}^G$, and $\overline{\overline{f}^G \cdot \overline{g}^G}^G$ is a
remainder.

We can also interpret these observations as saying that the set of remainders
on division by G has naturally defined addition and multiplication
operations which produce remainders as their results.

This “remainder arithmetic” is closely related to the quotient ring
$k[x_1, \ldots, x_n]/I$. We will assume the reader is familiar with quotient rings,
as described in Chapter 5 of [CLO] or in a course on abstract algebra.
Recall how this works: given $f \in k[x_1, \ldots, x_n]$, we have the coset

$[f] = f + I = \{f + h : h \in I\}$,

and the crucial property of cosets is

(2.3) $[f] = [g] \iff f - g \in I$.

The quotient ring $k[x_1, \ldots, x_n]/I$ consists of all cosets $[f]$ for
$f \in k[x_1, \ldots, x_n]$.

From (2.1), we see that $\overline{f}^G \in [f]$, and then (2.2) and (2.3) show that we
have a one-to-one correspondence

remainders $\longleftrightarrow$ cosets
$\overline{f}^G \longleftrightarrow [f]$.

Thus we can think of the remainder $\overline{f}^G$ as a standard representative of its
coset $[f] \in k[x_1, \ldots, x_n]/I$. Furthermore, it follows easily that remainder
arithmetic is exactly the arithmetic in $k[x_1, \ldots, x_n]/I$. That is, under the
above correspondence we have

$\overline{f}^G + \overline{g}^G \longleftrightarrow [f] + [g]$
$\overline{\overline{f}^G \cdot \overline{g}^G}^G \longleftrightarrow [f] \cdot [g]$.

Since we can add elements of $k[x_1, \ldots, x_n]/I$ and multiply by constants
(the cosets $[c]$ for $c \in k$), $k[x_1, \ldots, x_n]/I$ also has the structure of a vector
space over the field k. A ring that is also a vector space in this fashion
is called an algebra. The algebra $k[x_1, \ldots, x_n]/I$ will be denoted by A
throughout the rest of this section, which will focus on its vector space
structure.

An important observation is that remainders are the linear combinations
of the monomials $x^\alpha \notin \langle \mathrm{LT}(I) \rangle$ in this vector space structure. (Strictly
speaking, we should use cosets, but in much of this section we will identify
a remainder with its coset in A.) Since this set of monomials is linearly
independent in A (why?), it can be regarded as a basis of A. In other
words, the monomials

$B = \{x^\alpha : x^\alpha \notin \langle \mathrm{LT}(I) \rangle\}$

form a basis of A (more precisely, their cosets are a basis). We will refer to
elements of B as basis monomials. In the literature, basis monomials are
often called standard monomials.

The following example illustrates how to compute in A using basis
monomials. Let

(2.4) $G = \{x^2 + 3xy/2 + y^2/2 - 3x/2 - 3y/2,\ xy^2 - x,\ y^3 - y\}$.

Using the grevlex order with x > y, it is easy to verify that G is a Grobner
basis for the ideal $I = \langle G \rangle \subset C[x, y]$ generated by G. By examining the
leading monomials of G, we see that $\langle \mathrm{LT}(I) \rangle = \langle x^2, xy^2, y^3 \rangle$. The only
monomials not lying in this ideal are those in

$B = \{1, x, y, xy, y^2\}$

so that by the above observation, these five monomials form a vector space
basis for $A = C[x, y]/I$ over C.

We now turn to the structure of the quotient ring A. The addition op- 
eration in A can be viewed as an ordinary vector sum operation once we 
express elements of A in terms of the basis B in (2.4). Hence we will consider 
the addition operation to be completely understood. 

Perhaps the most natural way to describe the multiplication operation 
in A is to give a table of the remainders of all products of pairs of elements 
from the basis B. Since multiplication in A distributes over addition, this 
information will suffice to determine the products of all pairs of elements 
of A. 

For example, the remainder of the product x · xy may be computed
as follows using Maple. Using the Grobner basis G, we compute

normalf(x^2*y,G,[x,y],tdeg);

and obtain

$\frac{3}{2}xy - \frac{3}{2}x + \frac{3}{2}y^2 - \frac{1}{2}y$.
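
For readers without Maple, the same remainder can be reproduced with a short division routine. The Python sketch below is illustrative only: polynomials are stored as dictionaries from exponent pairs (i, j), meaning $x^i y^j$, to rational coefficients, and the names grevlex_key, lead, add_term, and normal_form are assumptions of this sketch rather than functions of any package.

```python
from fractions import Fraction as F


def grevlex_key(m):
    """Sort key for exponent tuples: a larger key is a larger grevlex monomial."""
    return (sum(m), tuple(-e for e in reversed(m)))


def lead(p):
    """Leading monomial and coefficient of a nonzero polynomial."""
    m = max(p, key=grevlex_key)
    return m, p[m]


def add_term(p, m, c):
    """Add c * x^m to p in place, dropping terms that cancel."""
    c = p.get(m, 0) + c
    if c == 0:
        p.pop(m, None)
    else:
        p[m] = c


def normal_form(f, G):
    """Remainder of f on division by the list G (division algorithm)."""
    f, r = dict(f), {}
    while f:
        m, c = lead(f)
        for g in G:
            gm, gc = lead(g)
            if all(a >= b for a, b in zip(m, gm)):      # LT(g) divides LT(f)
                shift = tuple(a - b for a, b in zip(m, gm))
                q = c / gc
                for t, tc in g.items():                 # f := f - q * x^shift * g
                    add_term(f, tuple(a + b for a, b in zip(t, shift)), -q * tc)
                break
        else:                                           # no LT(g) divides: move term to r
            add_term(r, m, c)
            del f[m]
    return r


# The Grobner basis G from (2.4)
G = [
    {(2, 0): F(1), (1, 1): F(3, 2), (0, 2): F(1, 2), (1, 0): F(-3, 2), (0, 1): F(-3, 2)},
    {(1, 2): F(1), (1, 0): F(-1)},
    {(0, 3): F(1), (0, 1): F(-1)},
]
remainder = normal_form({(2, 1): F(1)}, G)   # remainder of x^2 * y
```

The result has the four terms of the remainder displayed above, with coefficients 3/2, -3/2, 3/2, and -1/2 on xy, x, $y^2$, and y.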

Exercise 1. By computing all such products, verify that the multiplication
table for the elements of the basis B is:

(2.5)
         ·   |  1     x     y     xy    y^2
        -----+------------------------------
         1   |  1     x     y     xy    y^2
         x   |  x     α     xy    β     x
         y   |  y     xy    y^2   x     y
         xy  |  xy    β     x     α     xy
         y^2 |  y^2   x     y     xy    y^2

where

$\alpha = -3xy/2 - y^2/2 + 3x/2 + 3y/2$
$\beta = 3xy/2 + 3y^2/2 - 3x/2 - y/2$.

This example was especially nice because A was finite-dimensional as a
vector space over C. In general, for any field $k \subset C$, we have the following
basic theorem which describes when $k[x_1, \ldots, x_n]/I$ is finite dimensional.

• (Finiteness Theorem) Let $k \subset C$ be a field, and let $I \subset k[x_1, \ldots, x_n]$ be
an ideal. Then the following conditions are equivalent:

a. The algebra $A = k[x_1, \ldots, x_n]/I$ is finite-dimensional over k.

b. The variety $\mathbf{V}(I) \subset C^n$ is a finite set.

c. If G is a Grobner basis for I, then for each i, $1 \le i \le n$, there is an
$m_i \ge 0$ such that $x_i^{m_i} = \mathrm{LT}(g)$ for some $g \in G$.

For a proof of this result, see Theorem 6 of Chapter 5, §3 of [CLO], Theorem
2.2.7 of [AL], or Theorem 6.54 of [BW]. An ideal satisfying any of the above
conditions is said to be zero-dimensional. Thus

A is a finite-dimensional algebra $\iff$ I is a zero-dimensional ideal.

A nice consequence of this theorem is that I is zero-dimensional if and
only if there is a nonzero polynomial in $I \cap k[x_i]$ for each i = 1, ..., n. To
see why this is true, first suppose that I is zero-dimensional, and let G be a
reduced Grobner basis for any lex order with $x_i$ as the “last” variable (i.e.,
$x_j > x_i$ for $j \ne i$). By item c above, there is some $g \in G$ with $\mathrm{LT}(g) = x_i^{m_i}$.
Since we’re using a lex order with $x_i$ last, this implies $g \in k[x_i]$ and hence
g is the desired nonzero polynomial. Note that g generates $I \cap k[x_i]$ by the
Elimination Theorem.

Going the other way, suppose $I \cap k[x_i]$ is nonzero for each i, and let $m_i$ be
the degree of the unique monic generator of $I \cap k[x_i]$ (remember that $k[x_i]$
is a principal ideal domain; see Corollary 4 of Chapter 1, §5 of [CLO]).
Then $x_i^{m_i} \in \langle \mathrm{LT}(I) \rangle$ for any monomial order, so that all monomials not in
$\langle \mathrm{LT}(I) \rangle$ will contain $x_i$ to a power strictly less than $m_i$. In other words, the
exponents $\alpha$ of the monomials $x^\alpha \notin \langle \mathrm{LT}(I) \rangle$ will all lie in the “rectangular
box”

$R = \{\alpha \in Z^n_{\ge 0} : \text{for each } i,\ 0 \le \alpha_i \le m_i - 1\}.$

This is a finite set of monomials, which proves that A is finite-dimensional
over k.

Given a zero-dimensional ideal I, it is now easy to describe an algorithm
for finding the set B of all monomials not in $\langle \mathrm{LT}(I) \rangle$. Namely, no matter
what monomial order we are using, the exponents of the monomials in
B will lie in the box R described above. For each $\alpha \in R$, we know that
$x^\alpha \notin \langle \mathrm{LT}(I) \rangle$ if and only if $\overline{x^\alpha}^G = x^\alpha$. Thus we can list the $\alpha \in R$ in some
systematic way and compute $\overline{x^\alpha}^G$ for each one. A vector space basis of A
is given by the set of monomials

$B = \{x^\alpha : \alpha \in R \text{ and } \overline{x^\alpha}^G = x^\alpha\}.$

See Exercise 13 below for a Maple procedure implementing this method.
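
The enumeration over the box can be sketched in Python for the example (2.4). Instead of computing a remainder for each monomial, the sketch uses the equivalent test that $x^\alpha$ is not divisible by any leading monomial of the Grobner basis; this substitution, the function name basis_monomials, and the encoding of monomials as exponent tuples are choices made here for brevity.

```python
from itertools import product


def basis_monomials(leading, box):
    """Exponent vectors a with 0 <= a_i < box[i] that are not divisible
    by any of the given leading monomials."""
    def divisible(a, m):
        return all(ai >= mi for ai, mi in zip(a, m))

    return [a for a in product(*(range(b) for b in box))
            if not any(divisible(a, m) for m in leading)]


# Leading monomials x^2, x*y^2, y^3 of G in (2.4).  The pure powers x^2 and
# y^3 give the box 0 <= a_1 <= 1, 0 <= a_2 <= 2.
B = basis_monomials([(2, 0), (1, 2), (0, 3)], box=(2, 3))
```

The list B contains the exponent pairs of 1, y, $y^2$, x, and xy, the five basis monomials found above. Only $xy^2$ is discarded from the six monomials in the box.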

The vector space structure on $A = k[x_1, \ldots, x_n]/I$ for a zero-dimensional
ideal I can be used in several important ways. To begin, let
us consider the problem of finding the monic generators of the elimination
ideals $I \cap k[x_i]$. As indicated above, we could find these polynomials
by computing several different lex Grobner bases, reordering the variables
each time to place $x_i$ last. This is an extremely inefficient method, however.
Instead, let us consider the set of non-negative powers of $[x_i]$ in A:

$S = \{1, [x_i], [x_i]^2, \ldots\}.$

Since A is finite-dimensional as a vector space over the field k, S must
be linearly dependent in A. Let $m_i$ be the smallest positive integer for
which $\{1, [x_i], [x_i]^2, \ldots, [x_i]^{m_i}\}$ is linearly dependent. Then there is a linear
combination

$$\sum_{j=0}^{m_i} c_j [x_i]^j = [0]$$

in A in which the $c_j \in k$ are not all zero. In particular, $c_{m_i} \ne 0$ since $m_i$ is
minimal. By the definition of the quotient ring, this is equivalent to saying
that

(2.6) $p_i(x_i) = \sum_{j=0}^{m_i} c_j x_i^j \in I.$

Exercise 2. Verify that $p_i(x_i)$ as in (2.6) is a generator of the ideal
$I \cap k[x_i]$, and develop an algorithm based on this fact to find the monic
generator of $I \cap k[x_i]$, given any Grobner basis G for a zero-dimensional
ideal I as input.

The algorithm suggested in Exercise 2 often requires far less computational
effort than a lex Grobner basis calculation. Any ordering (e.g. grevlex)
can be used to determine G, then only standard linear algebra (matrix
operations) is needed to determine whether the set $\{1, [x_i], [x_i]^2, \ldots, [x_i]^m\}$
is linearly dependent. We note that the finduni function from Maple’s
grobner package is an implementation of this method.
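
The linear algebra step can be illustrated for the ideal generated by (2.4) and the variable x. The Python sketch below is one possible organization, not the finduni implementation: it stores each power $[x]^j$ by its coordinates in the basis 1, x, y, xy, $y^2$ and stops at the first linear dependence. The columns of X_COLS are the remainders of x times each basis monomial, worked out from G by hand ($x \cdot x$ and $x \cdot xy$ are the polynomials $\alpha$ and $\beta$ of Exercise 1). The names X_COLS, times_x, and minimal_relation are assumptions.

```python
from fractions import Fraction as F

# Remainders of x*b for b in B = [1, x, y, xy, y^2], as coordinates w.r.t. B
X_COLS = [
    [0, 1, 0, 0, 0],                              # x * 1   = x
    [0, F(3, 2), F(3, 2), F(-3, 2), F(-1, 2)],    # x * x   = alpha
    [0, 0, 0, 1, 0],                              # x * y   = xy
    [0, F(-3, 2), F(-1, 2), F(3, 2), F(3, 2)],    # x * xy  = beta
    [0, 1, 0, 0, 0],                              # x * y^2 = x
]


def times_x(v):
    """Coordinates of [x]*[f] from the coordinates v of [f]."""
    return [sum(v[j] * X_COLS[j][i] for j in range(5)) for i in range(5)]


def minimal_relation():
    """Coefficients c_0, ..., c_m (with c_m = 1) of the first linear
    dependence among 1, [x], [x]^2, ..."""
    rows = []                                   # (pivot, reduced vector, combination)
    v, m = [F(1), F(0), F(0), F(0), F(0)], 0    # coordinates of [x]^0 = 1
    while True:
        w, combo = list(v), [F(0)] * m + [F(1)]
        for piv, bv, bc in rows:                # eliminate against earlier powers
            t = w[piv] / bv[piv]
            w = [a - t * b for a, b in zip(w, bv)]
            padded = bc + [F(0)] * (m + 1 - len(bc))
            combo = [a - t * b for a, b in zip(combo, padded)]
        piv = next((i for i, a in enumerate(w) if a != 0), None)
        if piv is None:                         # [x]^m depends on lower powers
            return combo
        rows.append((piv, w, combo))
        v, m = times_x(v), m + 1


coeffs = minimal_relation()
```

The coefficients returned are 0, 2, -1, -2, 1, so the monic generator in this example is $x^4 - 2x^3 - x^2 + 2x = x(x - 1)(x + 1)(x - 2)$. No lex Grobner basis was needed, only the remainders of five products and exact Gaussian elimination.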

We will next discuss how to find the radical of a zero-dimensional ideal
(see Chapter 1 for the definition of radical). To motivate what we will
do, recall from §1 how multiple roots of a polynomial can cause problems
when trying to find roots numerically. When dealing with a one-variable
polynomial p with coefficients lying in a subfield of C, it is easy to see that
the polynomial

$$p_{red} = \frac{p}{\mathrm{GCD}(p, p')}$$

has the same roots as p, but all with multiplicity one (for a proof of this, see
Exercises 14 and 15 of Chapter 1, §5 of [CLO]). We call $p_{red}$ the square-free
part of p.
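
The square-free part needs only rational arithmetic, as the following Python sketch suggests. Polynomials are coefficient lists with the constant term first; the function names and the test polynomial $x^3 - x^2$ are assumptions made for illustration.

```python
from fractions import Fraction as F


def trim(p):
    """Drop trailing zero coefficients (highest-degree end)."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p


def divmod_poly(a, b):
    """Quotient and remainder of a by b (b nonzero, already trimmed)."""
    a = [F(c) for c in a]
    q = [F(0)] * max(len(a) - len(b) + 1, 1)
    while len(trim(a)) >= len(b):
        a = trim(a)
        shift, t = len(a) - len(b), a[-1] / b[-1]
        q[shift] = t
        for i, c in enumerate(b):
            a[i + shift] -= t * c          # cancel the leading term of a
    return trim(q), trim(a)


def derivative(p):
    return [i * c for i, c in enumerate(p)][1:]


def gcd_poly(a, b):
    """Monic GCD by the Euclidean algorithm."""
    a, b = trim([F(c) for c in a]), trim([F(c) for c in b])
    while b:
        a, b = b, divmod_poly(a, b)[1]
    return [c / a[-1] for c in a]


def square_free_part(p):
    """p / GCD(p, p')."""
    return divmod_poly(p, gcd_poly(p, derivative(p)))[0]


p_red = square_free_part([0, 0, -1, 1])    # p = x^3 - x^2
```

For $p = x^3 - x^2 = x^2(x - 1)$ the GCD with $p'$ is x, and the result is $x^2 - x = x(x - 1)$, which has the same roots 0 and 1, each now simple.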

The radical $\sqrt{I}$ of an ideal I generalizes the idea of the square-free part
of a polynomial. In fact, we have the following elementary exercise.

Exercise 3. If $p \in k[x]$ is a nonzero polynomial, show that $\sqrt{\langle p \rangle} = \langle p_{red} \rangle$.

Since k[x] is a PID, this solves the problem of finding radicals for all
ideals in k[x]. For a general ideal $I \subset k[x_1, \ldots, x_n]$, it is more difficult
to find $\sqrt{I}$, though algorithms are known and have been implemented
in Axiom, Macaulay, REDUCE, and Singular. Fortunately, when I is
zero-dimensional, computing the radical is much easier, as shown by the
following proposition.

(2.7) Proposition. Let $I \subset C[x_1, \ldots, x_n]$ be a zero-dimensional ideal.
For each i = 1, ..., n, let $p_i$ be the unique monic generator of $I \cap C[x_i]$,
and let $p_{i,red}$ be the square-free part of $p_i$. Then

$$\sqrt{I} = I + \langle p_{1,red}, \ldots, p_{n,red} \rangle.$$

Proof. Write $J = I + \langle p_{1,red}, \ldots, p_{n,red} \rangle$. We first prove that J is a
radical ideal, i.e., that $J = \sqrt{J}$. For each i, using the fact that C is
algebraically closed, we can factor each $p_{i,red}$ to obtain
$p_{i,red} = (x_i - a_{i1})(x_i - a_{i2}) \cdots (x_i - a_{id_i})$, where the $a_{ij}$ are distinct. Then

$$J = J + \langle p_{1,red} \rangle = \bigcap_j \bigl(J + \langle x_1 - a_{1j} \rangle\bigr),$$

where the first equality holds since $p_{1,red} \in J$ and the second follows from
Exercise 9 below since $p_{1,red}$ has distinct roots. Now use $p_{2,red}$ to decompose
each $J + \langle x_1 - a_{1j} \rangle$ in the same way. This gives

$$J = \bigcap_{j,k} \bigl(J + \langle x_1 - a_{1j},\ x_2 - a_{2k} \rangle\bigr).$$

If we do this for all i = 1, 2, ..., n, we get the expression

$$J = \bigcap_{j_1, \ldots, j_n} \bigl(J + \langle x_1 - a_{1j_1}, \ldots, x_n - a_{nj_n} \rangle\bigr).$$

Since $\langle x_1 - a_{1j_1}, \ldots, x_n - a_{nj_n} \rangle$ is a maximal ideal, the ideal
$J + \langle x_1 - a_{1j_1}, \ldots, x_n - a_{nj_n} \rangle$ is either $\langle x_1 - a_{1j_1}, \ldots, x_n - a_{nj_n} \rangle$ or the whole ring
$C[x_1, \ldots, x_n]$. It follows that J is a finite intersection of maximal ideals.
Since a maximal ideal is radical and an intersection of radical ideals is
radical, we conclude that J is a radical ideal.

Now we can prove that $J = \sqrt{I}$. The inclusion $I \subset J$ is built into
the definition of J, and the inclusion $J \subset \sqrt{I}$ follows from the Strong
Nullstellensatz, since the square-free parts of the $p_i$ vanish at all the points
of $\mathbf{V}(I)$. Hence we have

$$I \subset J \subset \sqrt{I}.$$

Taking radicals in this chain of inclusions shows that $\sqrt{J} = \sqrt{I}$. But J is
radical, so $\sqrt{J} = J$ and we are done. □

A Maple procedure that implements an algorithm for the radical of a
zero-dimensional ideal based on Proposition (2.7) is discussed in Exercise
16 below. It is perhaps worth noting that even though we have proved
Proposition (2.7) using the properties of C, the actual computation of
the polynomials $p_{i,red}$ will involve only rational arithmetic when the input
polynomials are in $Q[x_1, \ldots, x_n]$.

For example, consider the ideal

(2.8) $I = \langle y^4x + 3x^3 - y^4 - 3x^2,\ x^2y - 2x^2,\ 2y^4x - x^3 - 2y^4 + x^2 \rangle.$

Exercise 4. Using Exercise 2 above, show that

$I \cap Q[x] = \langle x^3 - x^2 \rangle$

and

$I \cap Q[y] = \langle y^5 - 2y^4 \rangle.$

Writing $p_1(x) = x^3 - x^2$ and $p_2(y) = y^5 - 2y^4$, we can compute the
square-free parts in Maple as follows. The command

p1red := simplify(p1/gcd(p1,diff(p1,x)));

will produce

$p_{1,red}(x) = x(x - 1).$

Similarly,

$p_{2,red}(y) = y(y - 2).$

Hence by Proposition (2.7), $\sqrt{I}$ is the ideal

$\langle y^4x + 3x^3 - y^4 - 3x^2,\ x^2y - 2x^2,\ 2y^4x - x^3 - 2y^4 + x^2,\ x(x - 1),\ y(y - 2) \rangle.$

We note that Proposition (2.7) yields a basis, but usually not a Grobner
basis, for $\sqrt{I}$.



Exercise 5. How do the dimensions of the vector spaces $C[x, y]/I$ and
$C[x, y]/\sqrt{I}$ compare in this example? How could you determine the number
of distinct points in $\mathbf{V}(I)$? (There are two.)



We will conclude this section with a very important result relating the
dimension of A and the number of points in the variety $\mathbf{V}(I)$, or what is
the same, the number of solutions of the equations $f_1 = \cdots = f_s = 0$ in
$C^n$. To prepare for this we will need the following lemma.



(2.9) Lemma. Let $S = \{p_1, \ldots, p_m\}$ be a finite subset of $C^n$. There exist
polynomials $g_i \in C[x_1, \ldots, x_n]$, i = 1, ..., m, such that

$$g_i(p_j) = \begin{cases} 0 & \text{if } i \ne j, \text{ and} \\ 1 & \text{if } i = j. \end{cases}$$

For instance, if $p_i = (a_{i1}, \ldots, a_{in})$ and the first coordinates $a_{i1}$ are
distinct, then we can take

$$g_i = g_i(x_1) = \frac{\prod_{j \ne i} (x_1 - a_{j1})}{\prod_{j \ne i} (a_{i1} - a_{j1})}$$

as in the Lagrange interpolation formula. In any case, a collection of
polynomials $g_i$ with the desired properties can be found in a similar fashion. We
leave the proof to the reader as Exercise 11 below. The following theorem
ties all of the results of this section together, showing how the dimension
of the algebra A for a zero-dimensional ideal gives a bound on the number
of points in $\mathbf{V}(I)$, and also how radical ideals are special in this regard.
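
The Lagrange construction in the special case of distinct first coordinates can be checked numerically. In the Python sketch below, the function name lagrange_g and the four sample points are hypothetical; each $g_i$ is represented as a function that evaluates the interpolation polynomial at a point.

```python
from fractions import Fraction as F


def lagrange_g(points, i):
    """Return g_i as a function of a point, built from first coordinates only.
    Assumes the first coordinates of the points are distinct."""
    a = [p[0] for p in points]

    def g(point):
        value = F(1)
        for j, aj in enumerate(a):
            if j != i:
                value *= F(point[0] - aj, a[i] - aj)
        return value

    return g


points = [(0, 0), (1, 1), (-1, 1), (2, -1)]    # hypothetical sample points
table = [[lagrange_g(points, i)(p) for p in points]
         for i in range(len(points))]
```

The entries of table form the identity matrix, which is exactly the property $g_i(p_j) = 0$ for $i \ne j$ and $g_i(p_i) = 1$ required in the lemma. When two points share a first coordinate, this particular choice fails and a different linear form is needed, as in Exercise 11.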



(2.10) Theorem. Let I be a zero-dimensional ideal in $C[x_1, \ldots, x_n]$, and
let $A = C[x_1, \ldots, x_n]/I$. Then $\dim_C(A)$ is greater than or equal to the
number of points in $\mathbf{V}(I)$. Moreover, equality occurs if and only if I is a
radical ideal.

Proof. Let I be a zero-dimensional ideal. By the Finiteness Theorem,
$\mathbf{V}(I)$ is a finite set in $C^n$, say $\mathbf{V}(I) = \{p_1, \ldots, p_m\}$. Consider the mapping

$$\varphi : C[x_1, \ldots, x_n]/I \to C^m, \qquad [f] \mapsto (f(p_1), \ldots, f(p_m))$$

given by evaluating a coset at the points of $\mathbf{V}(I)$. In Exercise 12 below,
you will show that $\varphi$ is a well-defined linear map.

To prove the first statement in the theorem, it suffices to show that $\varphi$
is onto. Let $g_1, \ldots, g_m$ be a collection of polynomials as in Lemma (2.9).
Given an arbitrary $(\lambda_1, \ldots, \lambda_m) \in C^m$, let $f = \sum_{i=1}^m \lambda_i g_i$. An easy
computation shows that $\varphi([f]) = (\lambda_1, \ldots, \lambda_m)$. Thus $\varphi$ is onto, and hence
$\dim(A) \ge m$.

Next, suppose that I is radical. If $[f] \in \ker(\varphi)$, then $f(p_i) = 0$ for all
i, so that by the Strong Nullstellensatz, $f \in \mathbf{I}(\mathbf{V}(I)) = \sqrt{I} = I$. Thus
$[f] = [0]$, which shows that $\varphi$ is one-to-one as well as onto. Then $\varphi$ is an
isomorphism, which proves that $\dim(A) = m$ if I is radical.

Conversely, if $\dim(A) = m$, then $\varphi$ is an isomorphism since it is an
onto linear map between vector spaces of the same dimension. Hence $\varphi$ is
one-to-one. We can use this to prove that I is radical as follows. Since the
inclusion $I \subset \sqrt{I}$ always holds, it suffices to consider $f \in \sqrt{I} = \mathbf{I}(\mathbf{V}(I))$
and show that $f \in I$. If $f \in \sqrt{I}$, then $f(p_i) = 0$ for all i, which implies
$\varphi([f]) = (0, \ldots, 0)$. Since $\varphi$ is one-to-one, we conclude that $[f] = [0]$, or in
other words that $f \in I$, as desired. □

In Chapter 4, we will see that in the case I is not radical, there are
well-defined multiplicities at each point in $\mathbf{V}(I)$ so that the sum of the
multiplicities equals $\dim(A)$.



Additional Exercises for §2 

Exercise 6. Using the grevlex order, construct the monomial basis B for
the quotient algebra $A = C[x, y]/I$, where I is the ideal from (2.8) and
construct the multiplication table for B in A.

Exercise 7. In this exercise, we will explain how the ideal
$I = \langle x^2 + 3xy/2 + y^2/2 - 3x/2 - 3y/2,\ xy^2 - x,\ y^3 - y \rangle$ from (2.4) was constructed.
The basic idea was to start from a finite set of points and construct a
system of equations, rather than the reverse.

To begin, consider the maximal ideals

$I_1 = \langle x, y \rangle$, $I_2 = \langle x - 1, y - 1 \rangle$,
$I_3 = \langle x + 1, y - 1 \rangle$, $I_4 = \langle x - 1, y + 1 \rangle$,
$I_5 = \langle x - 2, y + 1 \rangle$

in C[x, y]. Each variety $\mathbf{V}(I_j)$ is a single point in $C^2$, indeed in $Q^2 \subset C^2$.
The union of the five points forms an affine variety V, and by the
algebra-geometry dictionary from Chapter 1, $V = \mathbf{V}(I_1 \cap I_2 \cap \cdots \cap I_5)$.

An algorithm for intersecting ideals is described in Chapter 1. Use it
to compute the intersection $I = I_1 \cap I_2 \cap \cdots \cap I_5$ and find the reduced
Grobner basis for I with respect to the grevlex order (x > y). Your result
should be the Grobner basis given in (2.4).

Exercise 8.

a. Use the method of Proposition (2.7) to show that the ideal I from (2.4)
is a radical ideal.

b. Give a non-computational proof of the statement from part a using the
following observation. By the form of the generators of each of the ideals
$I_j$ in Exercise 7, $\mathbf{V}(I_j)$ is a single point and $I_j$ is the ideal $\mathbf{I}(\mathbf{V}(I_j))$. As
a result, $I_j = \sqrt{I_j}$ by the Strong Nullstellensatz. Then use the general
fact about intersections of radical ideals from part a of Exercise 9 from §4
of Chapter 1.

Exercise 9. This exercise is used in the proof of Proposition (2.7). Suppose
we have an ideal $I \subset k[x_1, \ldots, x_n]$, and let $p = (x_1 - a_1) \cdots (x_1 - a_d)$,
where $a_1, \ldots, a_d$ are distinct. The goal of this exercise is to prove that

$$I + \langle p \rangle = \bigcap_j \bigl(I + \langle x_1 - a_j \rangle\bigr).$$

a. Prove that $I + \langle p \rangle \subset \bigcap_j (I + \langle x_1 - a_j \rangle)$.

b. Let $p_j = \prod_{i \ne j} (x_1 - a_i)$. Prove that $p_j \cdot (I + \langle x_1 - a_j \rangle) \subset I + \langle p \rangle$.

c. Show that $p_1, \ldots, p_d$ are relatively prime, and conclude that there are
polynomials $h_1, \ldots, h_d$ such that $1 = \sum_j h_j p_j$.

d. Prove that $\bigcap_j (I + \langle x_1 - a_j \rangle) \subset I + \langle p \rangle$. Hint: Given h in the intersection,
write $h = \sum_j h_j p_j h$ and use part b.

Exercise 10. (The Dual Space of $k[x_1, \ldots, x_n]/I$) Recall that if V is a
vector space over a field k, then the dual space of V, denoted V*, is the
k-vector space of linear mappings $L : V \to k$. If V is finite-dimensional,
then so is V*, and dim V = dim V*. Let I be a zero-dimensional ideal in
$k[x_1, \ldots, x_n]$, and consider $A = k[x_1, \ldots, x_n]/I$ with its k-vector space
structure. Let G be a Grobner basis for I with respect to some monomial
ordering, and let $B = \{x^{\alpha(1)}, \ldots, x^{\alpha(d)}\}$ be the corresponding monomial
basis for A, so that for each $f \in k[x_1, \ldots, x_n]$,

$$\overline{f}^G = \sum_{j=1}^{d} c_j(f)\, x^{\alpha(j)}$$

for some $c_j(f) \in k$.

a. Show that each of the functions $c_j(f)$ is a linear function of
$f \in k[x_1, \ldots, x_n]$. Moreover, show that $c_j(f) = 0$ for all j if and only if
$f \in I$, or equivalently $[f] = 0$ in A.

b. Deduce that the collection B* of mappings $c_j$ given by $f \mapsto c_j(f)$,
j = 1, ..., d gives a basis of the dual space A*.

c. Show that B* is the dual basis corresponding to the basis B of A. That
is, show that

$$c_j(x^{\alpha(i)}) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}$$
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Exercise 11. Let 5 = {pi, . . . ,Prn} be a finite subset of 

a. Show that there exists a linear polynomial . . . , whose values 
at the points of S are distinct. 

b. Using the linear polynomial £ from part a, show that there exist 
polynomials gi G C[xi, . . . , x^], 2 = 1, . . . , m, such that 

, . f 0 if i ^ j, and 

Hint: Mimic the construction of the Lagrange interpolation polynomials 
in the discussion after the statement of Lemma (2.9). 
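
One way to see the construction in part b concretely is the following minimal Python sketch. The sample points in `S`, the search strategy inside `separating_form`, and all function names are illustrative assumptions rather than anything prescribed by the text; exact rational arithmetic is used so that the values 0 and 1 come out exactly.

```python
from fractions import Fraction


def separating_form(points):
    """Return coefficients (c_1, ..., c_n) of a linear form with distinct values on `points`.

    Tries the forms (1, t, t^2, ..., t^(n-1)) for t = 0, 1, 2, ...
    This search is an illustrative choice; it stops because, for distinct
    points, only finitely many t can make two values coincide.
    """
    n = len(points[0])
    t = 0
    while True:
        coeffs = [Fraction(t) ** k for k in range(n)]
        values = [sum(c * x for c, x in zip(coeffs, p)) for p in points]
        if len(set(values)) == len(points):
            return coeffs
        t += 1


def lagrange_values(points):
    """Return g with g(i, p) = value at p of a polynomial g_i with g_i(p_j) = [i == j]."""
    coeffs = separating_form(points)

    def ell(p):
        return sum(c * x for c, x in zip(coeffs, p))

    vals = [ell(p) for p in points]

    def g(i, p):
        # g_i = product over j != i of (ell - ell(p_j)) / (ell(p_i) - ell(p_j))
        result = Fraction(1)
        for j, vj in enumerate(vals):
            if j != i:
                result *= (ell(p) - vj) / (vals[i] - vj)
        return result

    return g


# Hypothetical sample points in Q^2; two of them share an x-coordinate,
# so the coordinate function x alone does not separate them.
S = [(0, 0), (1, 1), (-1, 1), (1, -1), (2, -1)]
g = lagrange_values(S)
table = [[g(i, p) for p in S] for i in range(len(S))]
```

For these sample points the search settles on the form x + 3y, and `table` is the identity matrix, which is exactly the property required of the g_i.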

Exercise 12. As in Theorem (2.10), suppose that $\mathbf{V}(I) = \{p_1, \ldots, p_m\}$.

a. Prove that the map $\varphi : \mathbb{C}[x_1, \ldots, x_n]/I \to \mathbb{C}^m$ given by
evaluation at $p_1, \ldots, p_m$ is a well-defined linear map. Hint: $[f] = [g]$
implies $f - g \in I$.

b. We can regard $\mathbb{C}^m$ as a ring with coordinate-wise multiplication. Thus

$$(a_1, \ldots, a_m) \cdot (b_1, \ldots, b_m) = (a_1 b_1, \ldots, a_m b_m).$$

With this ring structure, $\mathbb{C}^m$ is a direct product of $m$ copies of
$\mathbb{C}$. Prove that the map $\varphi$ of part a is a ring homomorphism.

c. Prove that $\varphi$ is a ring isomorphism if and only if $I$ is radical. This
means that in the radical case, we can express $A$ as a direct product
of the simpler rings (namely, $m$ copies of $\mathbb{C}$). In Chapter 4, we will
generalize this result to the nonradical case.

Exercise 13. The following (very rudimentary) Maple procedure automates
the process of finding the monomial basis $B$ for the quotient algebra
$A = k[x_1, \ldots, x_n]/I$ for a zero-dimensional ideal $I$.

    kbasis := proc(PList,VList,torder)
    # returns a list of monomials forming a basis of the quotient
    # ring k[VList]/<PList> if the ideal is 0-dimensional, and
    # generates an error if it is not
    local B,C,G,v,t,l,m;
    if finite(PList,VList) then
      G := gbasis(PList,VList,torder);
      B := [1];
      for v in VList do
        m := degree(finduni(v,G,VList),v);
        C := B;
        for t in C do
          for l to m-1 do
            t := t*v;
            if normalf(t,G,VList,torder) = t then
              B := [op(B),t]
            fi;
          od;
        od;
      od;
      RETURN(B)
    else
      ERROR(`Ideal is not zero-dimensional, no finite basis`)
    fi
    end:

a. Show that this procedure correctly computes
$\{x^\alpha : x^\alpha \notin \langle \mathrm{LT}(I) \rangle\}$ if $A$ is a
finite-dimensional vector space over $k$, and terminates for all inputs.

b. Use this kbasis procedure to check the results for the ideal from (2.4).

c. Use this procedure to check your work from Exercise 6 above.

Exercise 14. The algorithm used in the procedure from Exercise 13 can
be improved considerably. The “box” $R$ that kbasis searches for elements
of the complement of $\langle \mathrm{LT}(I) \rangle$ is often much larger than
necessary. This is because the call to finduni, which finds a monic generator
for $I \cap k[x_i]$ for each $i$, gives an $m_i$ such that
$x_i^{m_i} \in \langle \mathrm{LT}(I) \rangle$, but $m_i$ might not be as small as
possible. For instance, consider the ideal $I$ from (2.4). The monic
generator of $I \cap \mathbb{C}[x]$ has degree 4 (check this). Hence kbasis computes
$\overline{x^2}^G$, $\overline{x^3}^G$ and rejects these monomials since they are not
remainders. But the Gröbner basis $G$ given in (2.4) shows that
$x^2 \in \langle \mathrm{LT}(I) \rangle$. Thus a smaller set of $\alpha$ containing the
exponents of the monomial basis $B$ can be determined directly by examining
the leading terms of the Gröbner basis $G$, without using finduni to get the
monic generator for $I \cap k[x_i]$. Develop and
implement an improved kbasis that takes this observation into account.
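
The observation in Exercise 14 can be sketched in Python, as an illustration of the idea rather than the Maple procedure the exercise asks for. The function name `standard_monomials` is hypothetical, and the leading monomials $x^2$, $xy^2$, $y^3$ used in the example are an assumption about the grevlex Gröbner basis in (2.4), chosen to be consistent with the monomial basis $\{1, x, y, xy, y^2\}$ used in §4.

```python
from itertools import product


def standard_monomials(leading, nvars):
    """Exponent vectors of the monomials divisible by no monomial in `leading`.

    `leading` lists the exponent vectors of the leading monomials of a
    Groebner basis. The search box is read off from the pure powers among
    them; if some variable has no pure power, the basis is not finite.
    """
    bounds = []
    for i in range(nvars):
        pure = [m[i] for m in leading
                if all(e == 0 for j, e in enumerate(m) if j != i)]
        if not pure:
            raise ValueError("no pure power of variable %d: basis is not finite" % i)
        bounds.append(min(pure))

    def divides(m, a):
        return all(mi <= ai for mi, ai in zip(m, a))

    box = product(*(range(b) for b in bounds))
    return [a for a in box if not any(divides(m, a) for m in leading)]


# Assumed leading monomials x^2, x*y^2, y^3, written as (deg_x, deg_y).
basis = standard_monomials([(2, 0), (1, 2), (0, 3)], 2)
```

Here the box has only 2 x 3 = 6 candidates, compared with the larger box obtained from a univariate generator of degree 4 in x, and a single candidate (xy^2) is rejected.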

Exercise 15. Using your improved kbasis procedure, develop and 
implement a procedure that computes the multiplication table for a 
finite-dimensional algebra A. 

Exercise 16. Implement the following Maple procedure for finding the 
radical of a zero-dimensional ideal given by Proposition (2.7) and test it on 
the examples from this section. 

    zdimradical := proc(PList,VList)
    # constructs a set of generators for the radical of a
    # zero-dimensional ideal.
    local p,pred,v,RList;
    if finite(PList,VList) then
      RList := PList;
      for v in VList do
        p := finduni(v,PList,VList);
        pred := simplify(p/gcd(p,diff(p,v)));
        RList := [op(RList),pred];
      od;
      RETURN(RList)
    else
      ERROR(`Ideal not zero-dimensional; method does not apply`)
    fi
    end:
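
The key step of the procedure is the line computing pred, the square-free part of the univariate polynomial p. As a small worked illustration (the polynomial below is a hypothetical example, not one taken from the text), consider $p(v) = (v-1)^2(v+2)$:

```latex
\begin{aligned}
p(v) &= (v-1)^2 (v+2) = v^3 - 3v + 2, \\
p'(v) &= 3v^2 - 3 = 3(v-1)(v+1), \\
\gcd(p, p') &= v - 1, \\
p_{\mathrm{red}} &= \frac{p}{\gcd(p, p')} = \frac{v^3 - 3v + 2}{v - 1} = v^2 + v - 2 = (v-1)(v+2).
\end{aligned}
```

The repeated root at $v = 1$ is reduced to a simple root while the set of roots is unchanged, which is the effect the procedure relies on for each variable.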



§3 Gröbner Basis Conversion

In this section, we will use linear algebra in $A = k[x_1, \ldots, x_n]/I$ to show
that a Gröbner basis $G$ for a zero-dimensional ideal $I$ with respect to one
monomial order can be converted to a Gröbner basis $G'$ for the same ideal
with respect to any other monomial order. The process is sometimes called
Gröbner basis conversion, and the idea comes from a paper of Faugère,
Gianni, Lazard, and Mora [FGLM]. We will illustrate the method by
converting from an arbitrary Gröbner basis $G$ to a lex Gröbner basis $G_{lex}$
(using any ordering on the variables). The Gröbner basis conversion method
is often used in precisely this situation, so that a more favorable monomial
order (such as grevlex) can be used in the application of Buchberger’s
algorithm, and the result can then be converted into a form more suited for
equation solving via elimination. For another discussion of this topic, see
[BW], §1 of Chapter 9.

The basic idea of the Faugère-Gianni-Lazard-Mora algorithm is quite
simple. We start with a Gröbner basis $G$ for a zero-dimensional ideal $I$,
and we want to convert $G$ to a lex Gröbner basis $G_{lex}$ for some lex order.
The algorithm steps through monomials in $k[x_1, \ldots, x_n]$ in increasing lex
order. At each step of the algorithm, we have a list $G_{lex} = \{g_1, \ldots, g_k\}$ of
elements in $I$ (initially empty, and at each stage a subset of the eventual
lex Gröbner basis), and a list $B_{lex}$ of monomials (also initially empty, and
at each stage a subset of the eventual lex monomial basis for $A$). For each
input monomial $x^\alpha$ (initially 1), the algorithm consists of three steps:

(3.1) Main Loop. Given the input $x^\alpha$, compute $\overline{x^\alpha}^G$. Then:
a. If $\overline{x^\alpha}^G$ is linearly dependent on the remainders (on division
by $G$) of the monomials in $B_{lex}$, then we have a linear combination

$$\overline{x^\alpha}^G - \sum_j c_j\, \overline{x^{\alpha(j)}}^G = 0,$$

where $x^{\alpha(j)} \in B_{lex}$ and $c_j \in k$. This implies that

$$g = x^\alpha - \sum_j c_j\, x^{\alpha(j)} \in I.$$

We add $g$ to the list $G_{lex}$ as the last element. Because the $x^\alpha$ are
considered in increasing lex order (see (3.3) below), whenever a polynomial
$g$ is added to $G_{lex}$, its leading term is $\mathrm{LT}(g) = x^\alpha$ with coefficient 1.
b. If $\overline{x^\alpha}^G$ is linearly independent from the remainders (on division
by $G$) of the monomials in $B_{lex}$, then we add $x^\alpha$ to $B_{lex}$ as the last element.

After the Main Loop acts on the monomial $x^\alpha$, we test $G_{lex}$ to see if we
have the desired Gröbner basis. This test needs to be done only if we added
a polynomial $g$ to $G_{lex}$ in part a of the Main Loop.

(3.2) Termination Test. If the Main Loop added a polynomial $g$ to $G_{lex}$,
then compute $\mathrm{LT}(g)$. If $\mathrm{LT}(g)$ is a power of $x_1$, where $x_1$ is the greatest
variable in our lex order, then the algorithm terminates.

The proof of Theorem (3.4) below will explain why this is the correct way
to terminate the algorithm. If the algorithm does not stop at this stage, we
use the following procedure to find the next input monomial for the Main
Loop:

(3.3) Next Monomial. Replace $x^\alpha$ with the next monomial in lex order
which is not divisible by any of the monomials $\mathrm{LT}(g_i)$ for $g_i \in G_{lex}$.

Exercise 3 below will explain how the Next Monomial procedure works.
Now repeat the above process by using the new $x^\alpha$ as input to the Main
Loop, and continue until the Termination Test tells us to stop.

Before we prove the correctness of this algorithm, let’s see how it works 
in an example. 

Exercise 1. Consider the ideal

$$I = \langle xy + z - xz,\; x^2 - z,\; 2x^3 - x^2yz - 1 \rangle$$

in $\mathbb{Q}[x, y, z]$. For grevlex order with $x > y > z$, $I$ has a Gröbner basis
$G = \{f_1, f_2, f_3, f_4\}$, where

$$\begin{aligned}
f_1 &= z^4 - 3z^3 - 4yz + 2z^2 - y + 2z - 2 \\
f_2 &= yz^2 + 2yz - 2z^2 + 1 \\
f_3 &= y^2 - 2yz + z^2 - z \\
f_4 &= x + y - z.
\end{aligned}$$

Thus $\langle \mathrm{LT}(I) \rangle = \langle z^4, yz^2, y^2, x \rangle$,
$B = \{1, y, z, z^2, z^3, yz\}$, and a remainder $\overline{f}^G$ is a linear
combination of elements of $B$. We will use basis conversion
to find a lex Gröbner basis for $I$, with $z > y > x$.

a. Carry out the Main Loop for $x^\alpha = 1, x, x^2, x^3, x^4, x^5, x^6$. At the end of
doing this, you should have

$$G_{lex} = \{x^6 - x^5 - 2x^3 + 1\}$$

$$B_{lex} = \{1, x, x^2, x^3, x^4, x^5\}.$$

Hint: The following computations will be useful:

$$\begin{aligned}
\overline{1}^G &= 1 \\
\overline{x}^G &= -y + z \\
\overline{x^2}^G &= z \\
\overline{x^3}^G &= -yz + z^2 \\
\overline{x^4}^G &= z^2 \\
\overline{x^5}^G &= z^3 + 2yz - 2z^2 + 1 \\
\overline{x^6}^G &= z^3.
\end{aligned}$$

Note that $\overline{1}^G, \ldots, \overline{x^5}^G$ are linearly independent while
$\overline{x^6}^G$ is a linear combination of $\overline{x^5}^G$, $\overline{x^3}^G$ and
$\overline{1}^G$. This is similar to Exercise 2 of §2.

b. After we apply the Main Loop to $x^6$, show that the monomial provided
by the Next Monomial procedure is $y$, and after $y$ passes through the
Main Loop, show that

$$G_{lex} = \{x^6 - x^5 - 2x^3 + 1,\; y - x^2 + x\}$$

$$B_{lex} = \{1, x, x^2, x^3, x^4, x^5\}.$$

c. Show that after $y$, Next Monomial produces $z$, and after $z$ passes through
the Main Loop, show that

$$G_{lex} = \{x^6 - x^5 - 2x^3 + 1,\; y - x^2 + x,\; z - x^2\}$$

$$B_{lex} = \{1, x, x^2, x^3, x^4, x^5\}.$$

d. Check that the Termination Test (3.2) terminates the algorithm when
$G_{lex}$ is as in part c. Hint: We’re using lex order with $z > y > x$.

e. Verify that $G_{lex}$ from part c is a lex Gröbner basis for $I$.
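
The linear algebra in part a of Exercise 1 can be checked with a short Python sketch. It takes the remainders listed in the hint as given, encodes each one as a coordinate vector with respect to $B = (1, y, z, z^2, z^3, yz)$, and runs steps (3.1)a and (3.1)b for the powers of $x$ only. The helper `solve_combination` and the data layout are assumptions made for illustration; computing the remainders themselves is left to a computer algebra system.

```python
from fractions import Fraction


def solve_combination(vectors, target):
    """Return c with sum(c[j] * vectors[j]) == target, or None if none exists.

    Assumes `vectors` are linearly independent, which holds for the
    remainders of the monomials already placed in B_lex.
    """
    rows, cols = len(target), len(vectors)
    # One row per coordinate, one column per vector, last column = target.
    aug = [[Fraction(vectors[j][i]) for j in range(cols)] + [Fraction(target[i])]
           for i in range(rows)]
    pivot_row = 0
    for col in range(cols):
        pr = next((r for r in range(pivot_row, rows) if aug[r][col] != 0), None)
        if pr is None:
            continue  # does not occur for independent vectors
        aug[pivot_row], aug[pr] = aug[pr], aug[pivot_row]
        piv = aug[pivot_row][col]
        aug[pivot_row] = [a / piv for a in aug[pivot_row]]
        for r in range(rows):
            if r != pivot_row and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[pivot_row])]
        pivot_row += 1
    # A zero row with nonzero right-hand side means the target is independent.
    if any(all(a == 0 for a in row[:-1]) and row[-1] != 0 for row in aug):
        return None
    return [aug[j][-1] for j in range(cols)]


# Remainders of x^e on division by G (from the hint), in the basis
# B = (1, y, z, z^2, z^3, yz).
remainders = {
    0: [1, 0, 0, 0, 0, 0],    # 1
    1: [0, -1, 1, 0, 0, 0],   # -y + z
    2: [0, 0, 1, 0, 0, 0],    # z
    3: [0, 0, 0, 1, 0, -1],   # -yz + z^2
    4: [0, 0, 0, 1, 0, 0],    # z^2
    5: [1, 0, 0, -2, 1, 2],   # z^3 + 2yz - 2z^2 + 1
    6: [0, 0, 0, 0, 1, 0],    # z^3
}

B_lex, G_lex = [], []   # exponents e of x^e in B_lex; relations found so far
for e in range(7):
    c = solve_combination([remainders[b] for b in B_lex], remainders[e])
    if c is None:
        B_lex.append(e)                          # step (3.1)b
    else:
        G_lex.append((e, dict(zip(B_lex, c))))   # step (3.1)a: x^e - sum c_j x^j in I
        break   # the full algorithm would continue with the monomial y
```

The relation found for $e = 6$ has coefficients $c_0 = -1$, $c_3 = 2$, $c_5 = 1$, that is, $x^6 - x^5 - 2x^3 + 1 \in I$, in agreement with part a.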



We will now show that the algorithm given by (3.1), (3.2) and (3.3)
terminates and correctly computes a lex Gröbner basis for the ideal $I$.

(3.4) Theorem. The algorithm described above terminates on every input
Gröbner basis $G$ generating a zero-dimensional ideal $I$, and correctly
computes a lex Gröbner basis $G_{lex}$ for $I$ and the lex monomial basis $B_{lex}$
for the quotient ring $A$.

Proof. We begin with the key observation that monomials are added
to the list $B_{lex}$ in strictly increasing lex order. Similarly, if
$G_{lex} = \{g_1, \ldots, g_k\}$, then

$$\mathrm{LT}(g_1) <_{lex} \cdots <_{lex} \mathrm{LT}(g_k),$$

where $>_{lex}$ is the lex order we are using. We also note that when the Main
Loop adds a new polynomial $g_{k+1}$ to $G_{lex} = \{g_1, \ldots, g_k\}$, the leading
term $\mathrm{LT}(g_{k+1})$ is the input monomial in the Main Loop. Since the input
monomials are provided by the Next Monomial procedure, it follows that
for all $k$,

(3.5) $\mathrm{LT}(g_{k+1})$ is divisible by none of $\mathrm{LT}(g_1), \ldots, \mathrm{LT}(g_k)$.

We can now prove that the algorithm terminates for all inputs $G$
generating zero-dimensional ideals. If the algorithm did not terminate for some
input $G$, then the Main Loop would be executed infinitely many times, so
one of the two alternatives in (3.1) would be chosen infinitely often. If the
first alternative were chosen infinitely often, $G_{lex}$ would give an infinite list
$\mathrm{LT}(g_1), \mathrm{LT}(g_2), \ldots$ of monomials. However, we have:

• (Dickson’s Lemma) Given an infinite list $x^{\alpha(1)}, x^{\alpha(2)}, \ldots$ of
monomials in $k[x_1, \ldots, x_n]$, there is an integer $N$ such that every
$x^{\alpha(i)}$ is divisible by one of $x^{\alpha(1)}, \ldots, x^{\alpha(N)}$.

(See, for example, Exercise 7 of [CLO], Chapter 2, §4.) When applied to
$\mathrm{LT}(g_1), \mathrm{LT}(g_2), \ldots$, Dickson’s Lemma would contradict (3.5). On the other
hand, if the second alternative were chosen infinitely often, then $B_{lex}$ would
give infinitely many monomials $x^{\alpha(j)}$ whose remainders on division by $G$
were linearly independent in $A$. This would contradict the assumption that
$I$ is zero-dimensional. As a result, the algorithm always terminates for $G$
generating a zero-dimensional ideal $I$.

Next, suppose that the algorithm terminates with $G_{lex} = \{g_1, \ldots, g_k\}$.
By the Termination Test (3.2), $\mathrm{LT}(g_k) = x_1^{a_1}$, where
$x_1 >_{lex} \cdots >_{lex} x_n$.
We will prove that $G_{lex}$ is a lex Gröbner basis for $I$ by contradiction.
Suppose there were some $g \in I$ such that $\mathrm{LT}(g)$ is not a multiple of any of
the $\mathrm{LT}(g_i)$, $i = 1, \ldots, k$. Without loss of generality, we may assume that $g$
is reduced with respect to $G_{lex}$ (replace $g$ by $\overline{g}^{G_{lex}}$).

If $\mathrm{LT}(g)$ is greater than $\mathrm{LT}(g_k) = x_1^{a_1}$, then one easily sees that
$\mathrm{LT}(g)$ is a multiple of $\mathrm{LT}(g_k)$ (see Exercise 2 below). Hence this case
can’t occur, which means that

$$\mathrm{LT}(g_i) < \mathrm{LT}(g) < \mathrm{LT}(g_{i+1})$$

for some $i < k$. But recall that the algorithm places monomials into $B_{lex}$
in strictly increasing order, and the same is true for the $\mathrm{LT}(g_i)$. All the
non-leading monomials in $g$ must be less than $\mathrm{LT}(g)$ in the lex order. They
are not divisible by any of $\mathrm{LT}(g_j)$ for $j \le i$, since $g$ is reduced. So, the
non-leading monomials that appear in $g$ would have been included in $B_{lex}$ by
the time $\mathrm{LT}(g)$ was reached by the Next Monomial procedure, and $g$ would
have been the next polynomial after $g_i$ included in $G_{lex}$ by the algorithm
(i.e., $g$ would equal $g_{i+1}$). This contradicts our assumption on $g$, which
proves that $G_{lex}$ is a lex Gröbner basis for $I$.

The final step in the proof is to show that when the algorithm terminates,
$B_{lex}$ consists of all basis monomials determined by the Gröbner basis $G_{lex}$.
We leave this as an exercise for the reader. □

Additional Exercises for §3 

Exercise 2. Consider the lex order with $x_1 > \cdots > x_n$ and fix a power
$x_1^{a_1}$ of $x_1$. Then, for any monomial $x^\alpha$ in $k[x_1, \ldots, x_n]$, prove that
$x^\alpha > x_1^{a_1}$ if and only if $x^\alpha$ is divisible by $x_1^{a_1}$.

Exercise 3. Suppose $G_{lex} = \{g_1, \ldots, g_k\}$, where
$\mathrm{LT}(g_1) < \cdots < \mathrm{LT}(g_k)$,
and let $x^\alpha$ be a monomial. This exercise will show how the Next Monomial
(3.3) procedure works, assuming that our lex order satisfies $x_1 > \cdots > x_n$.
Since this procedure is only used when the Termination Test fails, we can
assume that $\mathrm{LT}(g_k)$ is not a power of $x_1$.

a. Use Exercise 2 to show that none of the $\mathrm{LT}(g_i)$ divide $x_1^{\alpha_1 + 1}$.

b. Now consider the largest $1 \le k \le n$ such that none of the $\mathrm{LT}(g_i)$ divide
the monomial

$$x_1^{\alpha_1} \cdots x_{k-1}^{\alpha_{k-1}} x_k^{\alpha_k + 1}.$$

By part a, $k = 1$ has this property, so there must be a largest such $k$. If
$x^\beta$ is the monomial corresponding to the largest $k$, prove that $x^\beta > x^\alpha$
is the smallest monomial (relative to our lex order) greater than $x^\alpha$
which is not divisible by any of the $\mathrm{LT}(g_i)$.
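
The recipe of Exercise 3 translates almost directly into code. The following Python sketch is one possible rendering, assuming monomials are stored as exponent tuples $(\alpha_1, \ldots, \alpha_n)$ with $x_1 > \cdots > x_n$; the function names are hypothetical. The sample calls use the data of Exercise 1, where the lex order is $z > y > x$, so tuples are written as (deg z, deg y, deg x).

```python
def divides(m, a):
    """True if the monomial with exponents m divides the one with exponents a."""
    return all(mi <= ai for mi, ai in zip(m, a))


def next_monomial(alpha, leading_terms):
    """Next monomial after x^alpha in lex order divisible by no leading term.

    Follows Exercise 3: take the largest k such that
    x_1^a_1 ... x_{k-1}^a_{k-1} x_k^(a_k + 1) is divisible by none of the
    leading terms. Returns None only if even k = 1 fails, which would
    mean some leading term is a power of x_1 (the algorithm has stopped).
    """
    n = len(alpha)
    for k in range(n, 0, -1):
        beta = tuple(alpha[:k - 1]) + (alpha[k - 1] + 1,) + (0,) * (n - k)
        if not any(divides(lt, beta) for lt in leading_terms):
            return beta
    return None


# Data from Exercise 1, exponents ordered as (z, y, x).
after_x5 = next_monomial((0, 0, 5), [])                       # x^6
after_x6 = next_monomial((0, 0, 6), [(0, 0, 6)])              # y
after_y = next_monomial((0, 1, 0), [(0, 0, 6), (0, 1, 0)])    # z
```

The three results reproduce the sequence of input monomials $x^6$, $y$, $z$ described in parts a, b and c of Exercise 1.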



Exercise 4. Complete the proof of Theorem (3.4) by showing that when
the basis conversion algorithm terminates, the set $B_{lex}$ gives a monomial
basis for the quotient ring $A$.

Exercise 5. Use Gröbner basis conversion to find lex Gröbner bases for
the ideals in Exercises 6 and 7 from §1. Compare with your previous results.

Exercise 6. What happens if you try to apply the basis conversion
algorithm to an ideal that is not zero-dimensional? Can this method be used
for general Gröbner basis conversion? What if you have more information
about the lex basis elements, such as their total degrees, or bounds on those
degrees?

Exercise 7. Show that the output of the basis conversion algorithm is
actually a monic reduced lex Gröbner basis for $I = \langle G \rangle$.

Exercise 8. Implement the basis conversion algorithm outlined in (3.1),
(3.2) and (3.3) in a computer algebra system. Hint: Exercise 3 will be useful.
For a more complete description of the algorithm, see pages 428-433 of
[BW].



§4 Solving Equations via Eigenvalues 

The central problem of this chapter, finding the solutions of a system of
polynomial equations $f_1 = f_2 = \cdots = f_s = 0$ over $\mathbb{C}$, rephrases in
fancier language to finding the points of the variety $\mathbf{V}(I)$, where $I$ is
the ideal generated by $f_1, \ldots, f_s$. When the system has only finitely many
solutions, i.e., when $\mathbf{V}(I)$ is a finite set, the Finiteness Theorem from §2
says that $I$ is a zero-dimensional ideal and the algebra
$A = \mathbb{C}[x_1, \ldots, x_n]/I$ is a finite-dimensional vector space over
$\mathbb{C}$. The first half of this section exploits
the structure of $A$ in this case to evaluate an arbitrary polynomial $f$ at
the points of $\mathbf{V}(I)$; in particular, evaluating the polynomials $f = x_i$ gives
the coordinates of the points (Corollary (4.6) below). The values of $f$ on
$\mathbf{V}(I)$ turn out to be eigenvalues of certain linear mappings on $A$, and the
remainder of the section discusses techniques for evaluating them.

We begin with the easy observation that given a polynomial
$f \in \mathbb{C}[x_1, \ldots, x_n]$, we can use multiplication to define a linear map
$m_f$ from $A = \mathbb{C}[x_1, \ldots, x_n]/I$ to itself. More precisely, $f$ gives the
coset $[f] \in A$, and we define $m_f : A \to A$ by the rule: if $[g] \in A$, then

$$m_f([g]) = [f] \cdot [g] = [fg] \in A.$$

Then $m_f$ has the following basic properties.

(4.1) Proposition. Let $f \in \mathbb{C}[x_1, \ldots, x_n]$. Then

a. The map $m_f$ is a linear mapping from $A$ to $A$.

b. We have $m_f = m_g$ exactly when $f - g \in I$. Thus two polynomials give
the same linear map if and only if they differ by an element of $I$. In
particular, $m_f$ is the zero map exactly when $f \in I$.

Proof. The proof of part a is just the distributive law for multiplication
over addition in the ring $A$. If $[g], [h] \in A$ and $c \in k$, then

$$m_f(c[g] + [h]) = [f] \cdot (c[g] + [h]) = c[f] \cdot [g] + [f] \cdot [h] = c\,m_f([g]) + m_f([h]).$$

Part b is equally easy. Since $[1] \in A$ is a multiplicative identity, if
$m_f = m_g$, then

$$[f] = [f] \cdot [1] = m_f([1]) = m_g([1]) = [g] \cdot [1] = [g],$$

so $f - g \in I$. Conversely, if $f - g \in I$, then $[f] = [g]$ in $A$, so $m_f = m_g$. □

Since $A$ is a finite-dimensional vector space over $\mathbb{C}$, we can represent
$m_f$ by its matrix with respect to a basis. For our purposes, a monomial basis
$B$ such as the ones we considered in §2 will be the most useful, because
once we have the multiplication table for the elements in $B$, the matrices
of the multiplication operators $m_f$ can be read off immediately from the
table. We will denote this matrix also by $m_f$, and whether $m_f$ refers to the
matrix or the linear operator will be clear from the context. Proposition
(4.1) implies that $m_f = m_{\overline{f}^G}$, so that we may assume that $f$ is a
remainder.

For example, for the ideal $I$ from (2.4) of this chapter, the matrix for the
multiplication operator by $f$ may be obtained from the table (2.5) in the
usual way. Ordering the basis monomials as before,

$$B = \{1, x, y, xy, y^2\},$$

we make a $5 \times 5$ matrix whose $j$th column is the vector of coefficients in the
expansion in terms of $B$ of the image under $m_f$ of the $j$th basis monomial.
With $f = x$, for instance, we obtain

$$m_x = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 \\
1 & 3/2 & 0 & -3/2 & 1 \\
0 & 3/2 & 0 & -1/2 & 0 \\
0 & -3/2 & 1 & 3/2 & 0 \\
0 & -1/2 & 0 & 3/2 & 0
\end{pmatrix}.$$
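
As a reading aid, the columns of this matrix can be translated back into identities in $A$. The display below simply restates the second column $(0, 3/2, 3/2, -3/2, -1/2)^T$ and the fourth column $(0, -3/2, -1/2, 3/2, 3/2)^T$ of the matrix for $f = x$ in terms of the ordered basis $1, x, y, xy, y^2$; nothing beyond the matrix entries is assumed.

```latex
\begin{aligned}
m_x([x])  &= [x^2]   = \tfrac{3}{2}[x] + \tfrac{3}{2}[y] - \tfrac{3}{2}[xy] - \tfrac{1}{2}[y^2], \\
m_x([xy]) &= [x^2 y] = -\tfrac{3}{2}[x] - \tfrac{1}{2}[y] + \tfrac{3}{2}[xy] + \tfrac{3}{2}[y^2].
\end{aligned}
```

In the same way, the first, third and fifth columns say that $m_x$ sends $[1]$, $[y]$ and $[y^2]$ to $[x]$, $[xy]$ and $[x]$ respectively.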



Exercise 1. Find the matrices $m_1$, $m_y$, $m_{xy - y^2}$ with respect to $B$ in this
example. How do $m_{y^2}$ and $(m_y)^2$ compare? Why?

We note the following useful general properties of the matrices $m_f$ (the
proof is left as an exercise).

(4.2) Proposition. Let $f$, $g$ be elements of the algebra $A$. Then

a. $m_{f+g} = m_f + m_g$.

b. $m_{f \cdot g} = m_f \cdot m_g$ (where the product on the right means composition of
linear operators or matrix multiplication).

This proposition says that the map sending $f \in \mathbb{C}[x_1, \ldots, x_n]$ to the
matrix $m_f$ defines a ring homomorphism from $\mathbb{C}[x_1, \ldots, x_n]$ to the ring
$M_{d \times d}(\mathbb{C})$ of $d \times d$ matrices, where $d$ is the dimension of $A$ as a
$\mathbb{C}$-vector space. Furthermore, part b of Proposition (4.1) and the Fundamental
Theorem of Homomorphisms show that $[f] \mapsto m_f$ induces a one-to-one
homomorphism $A \to M_{d \times d}(\mathbb{C})$. A discussion of ring homomorphisms and the
Fundamental Theorem of Homomorphisms may be found in Chapter 5, §2
of [CLO], especially Exercise 16. But the reader should note that
$M_{d \times d}(\mathbb{C})$ is not a commutative ring, so we have here a slightly more
general situation than the one discussed there.

For use later, we also point out a corollary of Proposition (4.2). Let
$h(t) = \sum_{i=0}^{m} c_i t^i \in \mathbb{C}[t]$ be a polynomial. The expression
$h(f) = \sum_{i=0}^{m} c_i f^i$ makes sense as an element of
$\mathbb{C}[x_1, \ldots, x_n]$. Similarly $h(m_f) = \sum_{i=0}^{m} c_i (m_f)^i$ is
a well-defined matrix (the term $c_0$ should be interpreted as $c_0 I$, where $I$ is
the $d \times d$ identity matrix).

(4.3) Corollary. In the situation of Proposition (4.2), let $h \in \mathbb{C}[t]$ and
$f \in \mathbb{C}[x_1, \ldots, x_n]$. Then

$$m_{h(f)} = h(m_f).$$

Recall that a polynomial $f \in \mathbb{C}[x_1, \ldots, x_n]$ gives the coset $[f] \in A$.
Since $A$ is finite-dimensional, as we noted in §2 for $f = x_i$, the set
$\{1, [f], [f]^2, \ldots\}$ must be linearly dependent in the vector space structure
of $A$. In other words, there is a linear combination

$$\sum_{i=0}^{m} c_i [f]^i = [0]$$

in $A$, where $c_i \in \mathbb{C}$ are not all zero. By the definition of the quotient ring,
this is equivalent to saying that

(4.4) $$\sum_{i=0}^{m} c_i f^i \in I.$$

Hence $\sum_{i=0}^{m} c_i f^i$ vanishes at every point of $\mathbf{V}(I)$.

Now we come to the most important part of this discussion, culminating
in Theorem (4.5) and Corollary (4.6) below. We are looking for the points in
$\mathbf{V}(I)$, $I$ a zero-dimensional ideal. Let $h(t) \in \mathbb{C}[t]$, and let
$f \in \mathbb{C}[x_1, \ldots, x_n]$. By Corollary (4.3),

$$h(m_f) = 0 \iff h([f]) = [0] \text{ in } A.$$

The polynomials $h$ such that $h(m_f) = 0$ form an ideal in $\mathbb{C}[t]$ by the
following exercise.

Exercise 2. Given a $d \times d$ matrix $M$ with entries in a field $k$, consider
the collection $I_M$ of polynomials $h(t)$ in $k[t]$ such that $h(M) = 0$, the
$d \times d$ zero matrix. Show that $I_M$ is an ideal in $k[t]$.

The nonzero monic generator $h_M$ of the ideal $I_M$ is called the minimal
polynomial of $M$. By the basic properties of ideals in $k[t]$, if $h$ is any
polynomial with $h(M) = 0$, then the minimal polynomial $h_M$ divides $h$. In
particular, the Cayley-Hamilton Theorem from linear algebra tells us that
$h_M$ divides the characteristic polynomial of $M$. As a consequence, if
$k = \mathbb{C}$, the roots of $h_M$ are eigenvalues of $M$. Furthermore, all
eigenvalues of $M$ occur as roots of the minimal polynomial. See [Her] for a
more complete discussion of the Cayley-Hamilton Theorem and the minimal
polynomial of a matrix.

Let $h_f$ denote the minimal polynomial of the multiplication operator $m_f$
on $A$. We then have three interesting sets of numbers:

• the roots of the equation $h_f(t) = 0$,

• the eigenvalues of the matrix $m_f$, and

• the values of the function $f$ on $\mathbf{V}(I)$, the set of points we are looking
for.

The amazing fact is that all three sets are equal.

(4.5) Theorem. Let $I \subset \mathbb{C}[x_1, \ldots, x_n]$ be zero-dimensional, let
$f \in \mathbb{C}[x_1, \ldots, x_n]$, and let $h_f$ be the minimal polynomial of $m_f$ on
$A = \mathbb{C}[x_1, \ldots, x_n]/I$. Then, for $\lambda \in \mathbb{C}$, the following are
equivalent:

a. $\lambda$ is a root of the equation $h_f(t) = 0$,

b. $\lambda$ is an eigenvalue of the matrix $m_f$, and

c. $\lambda$ is a value of the function $f$ on $\mathbf{V}(I)$.

Proof. a ⇔ b follows from standard results in linear algebra.

b ⇒ c: Let $\lambda$ be an eigenvalue of $m_f$. Then there is a corresponding
eigenvector $[z] \ne [0] \in A$ such that $[f - \lambda][z] = [0]$. Aiming for a
contradiction, suppose that $\lambda$ is not a value of $f$ on $\mathbf{V}(I)$. That is,
letting $\mathbf{V}(I) = \{p_1, \ldots, p_m\}$, suppose that $f(p_i) \ne \lambda$ for all
$i = 1, \ldots, m$.

Let $g = f - \lambda$, so that $g(p_i) \ne 0$ for all $i$. By Lemma (2.9) of this
chapter, there exist polynomials $g_i$ such that $g_i(p_j) = 0$ if $i \ne j$, and
$g_i(p_i) = 1$. Consider the polynomial $g' = \sum_{i=1}^{m} (1/g(p_i))\, g_i$. It
follows that $g'(p_i) g(p_i) = 1$ for all $i$, and hence
$1 - g'g \in \mathbf{I}(\mathbf{V}(I))$. By the Nullstellensatz, $(1 - g'g)^\ell \in I$ for
some $\ell \ge 1$. Expanding by the binomial theorem
and collecting the terms that contain $g$ as a factor, we get
$1 - \tilde{g} g \in I$ for some $\tilde{g} \in \mathbb{C}[x_1, \ldots, x_n]$. In $A$, this last
inclusion implies that $[1] = [\tilde{g}][g]$, hence $[g]$ has a multiplicative
inverse $[\tilde{g}]$ in $A$.

But from the above we have $[g][z] = [f - \lambda][z] = [0]$ in $A$. Multiplying
both sides by $[\tilde{g}]$, we obtain $[z] = [0]$, which is a contradiction. Therefore
$\lambda$ must be a value of $f$ on $\mathbf{V}(I)$.

c ⇒ a: Let $\lambda = f(p)$ for $p \in \mathbf{V}(I)$. Since $h_f(m_f) = 0$, Corollary (4.3)
shows $h_f([f]) = [0]$, and then (4.4) implies $h_f(f) \in I$. This means $h_f(f)$
vanishes at every point of $\mathbf{V}(I)$, so that $h_f(\lambda) = h_f(f(p)) = 0$. □

Exercise 3. We saw earlier that the matrix of multiplication by $x$ in the
5-dimensional algebra $A = \mathbb{C}[x, y]/I$ from (2.4) of this chapter is given by
the matrix displayed before Exercise 1 in this section.

a. Using the minpoly command in Maple (part of the linalg package) or
otherwise, show that the minimal polynomial of this matrix is

$$h_x(t) = t^4 - 2t^3 - t^2 + 2t.$$

The roots of $h_x(t) = 0$ are thus $t = 0, -1, 1, 2$.

b. Now find all points of $\mathbf{V}(I)$ using the methods of §1 and show that
the roots of $h_x$ are exactly the distinct values of the function $f(x, y) = x$
at the points of $\mathbf{V}(I)$. (Two of the points have the same $x$-coordinate,
which explains why the degree and the number of roots are 4 instead of
5!) Also see Exercise 7 from §2 to see how the ideal $I$ was constructed.

c. Finally, find the minimal polynomial of the matrix $m_y$, determine its
roots, and explain the degree you get.
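
For readers taking the "or otherwise" route in part a, the minimal polynomial can be found with exact linear algebra: look for the first power $M^k$ that is a linear combination of the lower powers. The Python sketch below is one minimal way to do this, not a description of Maple's minpoly; the function names are hypothetical, and the matrix `M_X` is the matrix of multiplication by $x$ with respect to $\{1, x, y, xy, y^2\}$ as displayed before Exercise 1.

```python
from fractions import Fraction as F

# Matrix of m_x with respect to the basis 1, x, y, xy, y^2.
M_X = [
    [0, 0,        0, 0,        0],
    [1, F(3, 2),  0, F(-3, 2), 1],
    [0, F(3, 2),  0, F(-1, 2), 0],
    [0, F(-3, 2), 1, F(3, 2),  0],
    [0, F(-1, 2), 0, F(3, 2),  0],
]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def minimal_polynomial(M):
    """Coefficients [c_0, ..., c_{k-1}, 1] of the monic minimal polynomial of M."""
    d = len(M)
    power = [[F(int(i == j)) for j in range(d)] for i in range(d)]  # M^0
    basis = []  # (pivot index, reduced vector, combination of powers giving it)
    k = 0
    while True:
        vec = [x for row in power for x in row]   # M^k flattened to a vector
        combo = [F(0)] * k + [F(1)]               # vec = 1 * M^k so far
        for pivot, bvec, bcombo in basis:
            if vec[pivot] != 0:
                factor = vec[pivot] / bvec[pivot]
                padded = bcombo + [F(0)] * (len(combo) - len(bcombo))
                vec = [a - factor * b for a, b in zip(vec, bvec)]
                combo = [a - factor * b for a, b in zip(combo, padded)]
        pivot = next((i for i, x in enumerate(vec) if x != 0), None)
        if pivot is None:
            return combo   # sum_i combo[i] * M^i = 0 with leading coefficient 1
        basis.append((pivot, vec, combo))
        power = matmul(power, M)
        k += 1


h_x = minimal_polynomial(M_X)   # coefficients of 1, t, t^2, t^3, t^4
```

For this matrix the coefficient list corresponds to $t^4 - 2t^3 - t^2 + 2t = t(t-1)(t+1)(t-2)$, matching the polynomial stated in part a.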

When we apply Theorem (4.5) with $f = x_i$, we get a general result
exactly parallel to this example.

(4.6) Corollary. Let $I \subset \mathbb{C}[x_1, \ldots, x_n]$ be zero-dimensional. Then the
eigenvalues of the multiplication operator $m_{x_i}$ on $A$ coincide with the
$x_i$-coordinates of the points of $\mathbf{V}(I)$. Moreover, substituting $t = x_i$ in
the minimal polynomial $h_{x_i}$ yields the unique monic generator of the
elimination ideal $I \cap \mathbb{C}[x_i]$.

Corollary (4.6) indicates that it is possible to solve equations by
computing eigenvalues of the multiplication operators $m_{x_i}$. This fact has been
studied recently by Stetter [Ste], Möller [Mol], and others. As a result a
whole array of numerical methods for approximating eigenvalues can be
brought to bear on the root-finding problem, at least in favorable cases.
We include a brief discussion of some of these methods for the convenience
of some readers; the following two paragraphs may be safely ignored if
you are familiar with numerical eigenvalue techniques. For more details, we
suggest [BuF] or [Act].

In elementary linear algebra, eigenvalues of a matrix $M$ are usually
determined by solving the characteristic polynomial equation:

$$\det(M - tI) = 0.$$

The degree of the polynomial on the left hand side is the size of the matrix
$M$. But computing $\det(M - tI)$ for large matrices is a large job itself, and
as we have seen in §1, exact solutions (and even accurate approximations
to solutions) of polynomial equations of high degree over $\mathbb{R}$ or $\mathbb{C}$ can
be hard to come by. For these reasons the characteristic polynomial is almost
never used in practice, and other methods are needed.

The most basic numerical eigenvalue method is known as the power
method. It is based on the fact that if a matrix $M$ has a unique dominant
eigenvalue (i.e., an eigenvalue $\lambda$ satisfying $|\lambda| > |\mu|$ for all other
eigenvalues $\mu$ of $M$), then starting from a randomly chosen vector $x_0$, and
forming the sequence

$$x_{k+1} = \text{unit vector in direction of } M x_k,$$

we almost always approach an eigenvector for $\lambda$ as $k \to \infty$. An approximate
value for $|\lambda|$, the magnitude of the dominant eigenvalue, may be obtained by
computing the norm $\|M x_k\|$ at each step. If there is no unique dominant
eigenvalue, then the iteration may not converge, but the power method can also
be modified to eliminate that problem and to find other eigenvalues of $M$. In
particular, we can find the eigenvalue of $M$ closest to some fixed $s$ by
applying the power method to the matrix $M' = (M - sI)^{-1}$. For almost all
choices of $s$, there will be a unique dominant eigenvalue of $M'$. Moreover, if
$\lambda'$ is that dominant eigenvalue of $M'$, then $1/\lambda' + s$ is the eigenvalue of
$M$ closest to $s$. This observation makes it possible to search for all the
eigenvalues of a matrix as we would do in using the Newton-Raphson method to
find all the roots of a polynomial. Some of the same difficulties arise, too.
There are also much more sophisticated iterative methods, such as the LR
and QR algorithms, that can be used to determine all the (real or complex)
eigenvalues of a matrix except in some very uncommon degenerate
situations. It is known that the QR algorithm, for instance, converges for all
matrices having no more than two eigenvalues of any given magnitude in
$\mathbb{C}$. Some computer algebra systems (e.g., Maple and Mathematica) provide
built-in procedures that implement these methods.
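
The basic iteration can be sketched in a few lines of Python. This is a bare illustration under stated assumptions, not a robust numerical routine: the function name `power_method`, the fixed number of steps and the starting vector are all choices made here, and the test matrix is the matrix of multiplication by $x$ for the ideal from (2.4), whose eigenvalues are $0, -1, 1, 2$ by Exercise 3, so that 2 is the unique dominant eigenvalue.

```python
import math

# Matrix of m_x for the ideal from (2.4), in floating point.
M_X = [
    [0.0,  0.0, 0.0,  0.0, 0.0],
    [1.0,  1.5, 0.0, -1.5, 1.0],
    [0.0,  1.5, 0.0, -0.5, 0.0],
    [0.0, -1.5, 1.0,  1.5, 0.0],
    [0.0, -0.5, 0.0,  1.5, 0.0],
]


def power_method(M, x0, steps=100):
    """Return (estimate of |lambda|, unit vector) after `steps` iterations.

    Each step replaces x by the unit vector in the direction of M x and
    records the norm of M x, which approaches the magnitude of the
    dominant eigenvalue when that eigenvalue is unique.
    """
    norm0 = math.sqrt(sum(c * c for c in x0))
    x = [c / norm0 for c in x0]
    estimate = 0.0
    for _ in range(steps):
        y = [sum(a * b for a, b in zip(row, x)) for row in M]
        estimate = math.sqrt(sum(c * c for c in y))
        x = [c / estimate for c in y]
    return estimate, x


lam, vec = power_method(M_X, [1.0, 1.0, 1.0, 1.0, 1.0])
```

Since the next largest eigenvalue magnitude is 1, the error shrinks roughly by a factor of 1/2 per step here. The shifted variant with $(M - sI)^{-1}$ would additionally need a linear solve at each step and is not shown.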

A legitimate question at this point is this: Why might one consider apply- 
ing these eigenvalue techniques for root finding instead of using elimination? 
There are two reasons. 

The first concerns the amount of calculation necessary to carry out this
approach. The direct attack, solving systems via elimination as in §1,
imposes a choice of monomial order in the Gröbner basis we use. Pure
lex Gröbner bases frequently require a large amount of computation. As
we saw in §3, it is possible to compute a grevlex Gröbner basis first, then
convert it to a lex basis using the Faugère-Gianni-Lazard-Mora basis
conversion algorithm, with some savings in total effort. But basis conversion
is unnecessary if we use Corollary (4.6), because the algebraic structure of
$\mathbb{C}[x_1, \ldots, x_n]/I$ is independent of the monomial order used for the
Gröbner basis and remainder calculations. Hence any monomial order can be used
to determine the matrices of the multiplication operators $m_{x_i}$.

The second reason concerns the amount of numerical versus symbolic
computation involved, and the potential for numerical instability. In the
frequently-encountered case that the generators for $I$ have rational
coefficients, the entries of the matrices $m_{x_i}$ will also be rational, and hence
can be determined exactly by symbolic computation. Thus the numerical
component of the calculation is restricted to the eigenvalue calculations.

There is also a significant difference even between a naive first idea for
implementing this approach and the elimination method discussed in §1.
Namely, we could begin by computing all the $m_{x_i}$ and their eigenvalues
separately. Then with some additional computation we could determine
exactly which vectors $(x_1, \ldots, x_n)$ formed using values of the coordinate
functions actually give approximate solutions. The difference here is that
the computed values of $x_i$ are not used in the determination of the $x_j$,
$j \ne i$. In §1, we saw that a major source of error in approximate solutions
was the fact that small errors in one variable could produce larger errors
in the other variables when we substitute them and use the Extension
Theorem. Separating the computations of the values $x_i$ from one another,
we can avoid those accumulated error phenomena (and also the numerical
stability problems encountered in other non-elimination methods).

We will see shortly that it is possible to reduce the computational effort
involved even further. Indeed, it suffices to consider the eigenvalues of only
one suitably-chosen multiplication operator $m_{c_1 x_1 + \cdots + c_n x_n}$. Before
developing this result, however, we present an example using the more naive
approach.

Exercise 4. We will apply the ideas sketched above to find approximations 
to the complex solutions of the system: 

0 = x^ — 2xz + 5 
0 = xy^ -\-yz 1 

0 = — Sxz. 

a. First, compute a Grobner basis to determine the monomial basis for the 
quotient algebra. We can use the grevlex (Maple tdeg) monomial order 
and the kbasis procedure from Exercise 13 in §2: 

PList := [x"2 “ 2*x*z + 5, x*y"2 + y*z + 1, 3*y"2 - 8*x*z] ; 
VList := [x,y,z] : 

G := gbasis (PList , VList, tdeg) ; 

B ;= kbasis (G, VList , tdeg) ; 

and obtain the 8 monomials: 

[l,a:, y, xy,z,z^,xz, yz], 

(You should compare this with the output of kbasis for lex order. Also 
print out the lex Grdbner basis for this ideal if you have a taste for 
complicated polynomials.) 

b. Using the monomial basis check that the matrix of the full 
multiplication operator rrix is 
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c. Now, applying the numerical eigenvalue routine eigenvals from Maple, 
check that there are two approximate real eigenvalues: 

-1.100987715, .9657124563, 

and 3 complex conjugate pairs. (These values agree with the results
of finding the monic generator of $I \cap \mathbb{C}[x]$ and doing numerical root
finding.)

d. Complete the calculation by finding the multiplication operators $m_y$,
$m_z$, computing their real eigenvalues, and determining which triples
$(x, y, z)$ give solutions. (There are exactly two real points.) Also see
Exercises 9 and 10 below for a second way to compute the eigenvalues
of $m_x$, $m_y$, and $m_z$.

In addition to eigenvalues, there are also eigenvectors to consider. In fact, 
every matrix M has two sorts of eigenvectors. The left eigenvectors of M 
are the usual ones, which are column vectors $v \ne 0$ such that

$M v = \lambda v$

for some $\lambda \in \mathbb{C}$. Since the transpose $M^T$ has the same eigenvalues $\lambda$ as $M$,
we can find a column vector $v' \ne 0$ such that

$M^T v' = \lambda v'$.

Taking transposes, we can write this equation as

$w M = \lambda w$,

where $w = v'^T$ is a row vector. We call $w$ a right eigenvector of $M$.

The left and right eigenvectors for a matrix are connected in the following
way. For simplicity, suppose that $M$ is a diagonalizable $n \times n$ matrix, so
that there is a basis for $\mathbb{C}^n$ consisting of left eigenvectors for $M$. In Exercise
7 below, you will show that there is a matrix equation $MQ = QD$, where
$Q$ is the matrix whose columns are the left eigenvectors in a basis for
$\mathbb{C}^n$, and $D$ is a diagonal matrix whose diagonal entries are the eigenvalues
of $M$. Rearranging the last equation, we have $Q^{-1}M = DQ^{-1}$. By the
second part of Exercise 7 below, the rows of $Q^{-1}$ are a collection of right
eigenvectors of $M$ that also form a basis for $\mathbb{C}^n$.
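
The relation between the two kinds of eigenvectors can be checked on a small case. The sketch below uses this section's convention (left eigenvectors are columns, right eigenvectors are rows). The $2 \times 2$ matrix and its eigenvectors are our own illustrative choices, not taken from the text, and exact rational arithmetic is used so that the equalities hold exactly.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(A):
    """Inverse of an invertible 2 x 2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical example: M has eigenvalues 5 and 2.
M = [[F(4), F(1)], [F(2), F(3)]]
Q = [[F(1), F(1)], [F(1), F(-2)]]   # columns (1,1) and (1,-2) are left eigenvectors
D = [[F(5), F(0)], [F(0), F(2)]]

Qinv = inverse_2x2(Q)
print(matmul(M, Q) == matmul(Q, D))        # M Q = Q D
print(matmul(Qinv, M) == matmul(D, Qinv))  # Q^{-1} M = D Q^{-1}

# Each row w of Q^{-1} satisfies w M = lambda w, so it is a right eigenvector.
for w, lam in zip(Qinv, (F(5), F(2))):
    print(matmul([w], M)[0] == [lam * t for t in w])
```

All four comparisons should print True, and the two rows of $Q^{-1}$ are visibly linearly independent.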

For a zero-dimensional ideal $I$, there is also a strong connection between
the points of $\mathbf{V}(I)$ and the right eigenvectors of the matrix $m_f$ relative to
the monomial basis $B$ coming from a Gröbner basis. We will assume that
$I$ is radical. In this case, Theorem (2.10) implies that $A$ has dimension $m$,
where $m$ is the number of points in $\mathbf{V}(I)$. Hence, we can write the monomial
basis $B$ as the cosets

$B = \{[x^{\alpha(1)}], \ldots, [x^{\alpha(m)}]\}$.

Using this basis, let $m_f$ be the matrix of multiplication by $f$. We can relate
the right eigenvectors of $m_f$ to points of $\mathbf{V}(I)$ as follows.

(4.7) Proposition. Suppose $f \in \mathbb{C}[x_1, \ldots, x_n]$ is chosen such that the
values $f(p)$ are distinct for $p \in \mathbf{V}(I)$. Then the right eigenspaces of
the matrix $m_f$ are 1-dimensional and are spanned by the row vectors
$(p^{\alpha(1)}, \ldots, p^{\alpha(m)})$ for $p \in \mathbf{V}(I)$.

Proof. If we write $m_f = (m_{ij})$, then for each $j$ between 1 and $m$,

$[x^{\alpha(j)} f] = m_f([x^{\alpha(j)}]) = m_{1j}[x^{\alpha(1)}] + \cdots + m_{mj}[x^{\alpha(m)}]$.

Now fix $p \in \mathbf{V}(I)$ and evaluate this equation at $p$ to obtain

$p^{\alpha(j)} f(p) = m_{1j}\, p^{\alpha(1)} + \cdots + m_{mj}\, p^{\alpha(m)}$

(this makes sense by Exercise 12 of §2). Doing this for $j = 1, \ldots, m$ gives

$f(p)\,(p^{\alpha(1)}, \ldots, p^{\alpha(m)}) = (p^{\alpha(1)}, \ldots, p^{\alpha(m)})\, m_f$.

Exercise 14 at the end of the section asks you to check this computation
carefully. Note that one of the basis monomials in $B$ is the coset $[1]$ (do
you see why?), which shows that $(p^{\alpha(1)}, \ldots, p^{\alpha(m)})$ is nonzero and hence is
a right eigenvector for $m_f$, with $f(p)$ as the corresponding eigenvalue.

By hypothesis, the $f(p)$ are distinct for $p \in \mathbf{V}(I)$, which means that the
$m \times m$ matrix $m_f$ has $m$ distinct eigenvalues. Linear algebra then implies
that the corresponding eigenspaces (right and left) are 1-dimensional. □

This proposition can be used to find the points in $\mathbf{V}(I)$ for any
zero-dimensional ideal $I$. The basic idea is as follows. First, we can assume that
$I$ is radical by replacing $I$ with $\sqrt{I}$ as computed by Proposition (2.8). Then
compute a Gröbner basis $G$ and monomial basis $B$ as usual. Now consider
the function

$f = c_1x_1 + \cdots + c_nx_n$,

where $c_1, \ldots, c_n$ are randomly chosen integers. This will ensure (with small
probability of failure) that the values $f(p)$ are distinct for $p \in \mathbf{V}(I)$.
Relative to the monomial basis $B$, we get the matrix $m_f$, so that we can
use standard numerical methods to find an eigenvalue $\lambda$ and corresponding
right eigenvector $v$ of $m_f$. This eigenvector, when combined with the
Gröbner basis $G$, makes it trivial to find a solution $p \in \mathbf{V}(I)$.

To see how this is done, first note that Proposition (4.7) implies

(4.8)   $v = c\,(p^{\alpha(1)}, \ldots, p^{\alpha(m)})$

for some nonzero constant $c$ and some $p \in \mathbf{V}(I)$. Write $p = (a_1, \ldots, a_n)$.
Our goal is to compute the coordinates $a_i$ of $p$ in terms of the coordinates
of $v$. Equation (4.8) implies that each coordinate of $v$ is of the form $c\,p^{\alpha(j)}$.
The Finiteness Theorem implies that for each $i$ between 1 and $n$, there is
$m_i \ge 1$ such that $x_i^{m_i}$ is the leading term of some element of $G$. If $m_i > 1$,
it follows that $[x_i] \in B$ (do you see why?), so that $c\,a_i$ is a coordinate of
$v$. As noted above, we have $[1] \in B$, so that $c$ is also a coordinate of $v$.
Consequently,

$a_i = \frac{c\,a_i}{c}$

is a ratio of coordinates of $v$. This way, we get the $x_i$-coordinate of $p$ for
all $i$ satisfying $m_i > 1$.

It remains to study the coordinates with $m_i = 1$. These variables appear
in none of the basis monomials in $B$ (do you see why?), so that we turn
instead to the Gröbner basis $G$ for guidance. Suppose the variables with
$m_i = 1$ are $x_{i_1}, \ldots, x_{i_\ell}$. We will assume that the variables are labeled so
that $x_1 > \cdots > x_n$ and $i_1 > \cdots > i_\ell$. In Exercise 15 below, you will show
that for $j = 1, \ldots, \ell$, there are elements $g_j \in G$ such that

$g_j = x_{i_j} + $ terms involving $x_i$ for $i > i_j$.

If we evaluate this at $p = (a_1, \ldots, a_n)$, we obtain

(4.9)   $0 = a_{i_j} + $ terms involving $a_i$ for $i > i_j$.

Since we already know $a_i$ for $i \notin \{i_1, \ldots, i_\ell\}$, these equations make it
a simple matter to find $a_{i_1}, \ldots, a_{i_\ell}$. We start with $a_{i_1}$. Since $i_1$ is the
largest of the indices $i_1, \ldots, i_\ell$, for $j = 1$ equation (4.9)
implies that $a_{i_1}$ is a polynomial in the coordinates of $p$ we already know.
Hence we get $a_{i_1}$. But once we know $a_{i_1}$, (4.9) shows that $a_{i_2}$ is also a
polynomial in known coordinates. Continuing in this way, we get all of the
coordinates of $p$.
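
The two steps just described can be sketched on a toy case. Everything in the example is our own assumption for illustration: the ideal $I = \langle x^2 - 1,\ y - 2x \rangle \subset \mathbb{Q}[x, y]$ with lex order $y > x$, its monomial basis $B = \{[1], [x]\}$, the choice $f = x$, and the two right eigenvectors, which are supplied by hand (with arbitrary scalings) rather than computed numerically. Here $x$ has $m_i = 2$, so its coordinate comes from a ratio of eigenvector entries, while $y$ has $m_i = 1$ and is read off from the basis element $y - 2x$.

```python
from fractions import Fraction

# Matrix of multiplication by f = x with respect to B = [1, x].
# Column j holds the coordinates of x * B[j]:  x*1 = x  and  x*x = x^2 = 1.
m_x = [[0, 1],
       [1, 0]]

def row_times_matrix(w, M):
    """Row vector w times matrix M."""
    return [sum(w[i] * M[i][j] for i in range(len(w))) for j in range(len(M[0]))]

def recover_point(v):
    """Recover p = (x, y) from a right (row) eigenvector v = c * (1, p_x)."""
    c, c_x = v                 # entries attached to the basis monomials 1 and x
    x = Fraction(c_x, c)       # the ratio removes the unknown scalar c
    y = 2 * x                  # from y - 2x = 0, the Groebner basis element with LT y
    return x, y

# (eigenvalue, right eigenvector) pairs, supplied by hand for this sketch.
eigenpairs = [(1, [3, 3]), (-1, [5, -5])]

points = []
for lam, v in eigenpairs:
    assert row_times_matrix(v, m_x) == [lam * t for t in v]   # v m_x = lam v
    points.append(recover_point(v))
print(points)
```

Both recovered points satisfy $x^2 - 1 = 0$ and $y - 2x = 0$, and in each case the eigenvalue equals the value of $f = x$ at the point, as Proposition (4.7) predicts.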

Exercise 5. Apply this method to find the solutions of the equations given
in Exercise 4. The $x$-coordinates of the solutions are distinct, so you can
assume $f = x$. Thus it suffices to compute the right eigenvectors of the
matrix $m_x$ of Exercise 4.

Since the right eigenvectors of $m_f$ help us find solutions in $\mathbf{V}(I)$, it is
natural to ask about the left eigenvectors. In Exercise 17 below, you will
show that these eigenvectors solve the interpolation problem, which asks
for a polynomial that takes preassigned values at the points of $\mathbf{V}(I)$.

This section has discussed several ideas for solving polynomial equations 
using linear algebra. We certainly do not claim that these ideas are a com- 
putational panacea for all polynomial systems, but they do give interesting 
alternatives to other, more traditional methods in numerical analysis, and 
they are currently an object of study in connection with the implementa- 
tion of the next generation of computer algebra systems. We will continue 
this discussion in §5 (where we study real solutions) and Chapter 3 (where 
we use resultants to solve polynomial systems). 



Additional Exercises for §4 
Exercise 6. Prove Proposition (4.2).

Exercise 7. Let $M$, $Q$, $P$, $D$ be $n \times n$ complex matrices, and assume $D$ is
a diagonal matrix.

a. Show that the equation $MQ = QD$ holds if and only if for each $j$ the
$j$th column of $Q$ is a left eigenvector of $M$ and the $j$th diagonal entry
of $D$ is the corresponding eigenvalue.

b. Show that the equation $PM = DP$ holds if and only if for each $i$ the
$i$th row of $P$ is a right eigenvector of $M$ and the $i$th diagonal entry of
$D$ is the corresponding eigenvalue.

c. If $MQ = QD$ and $Q$ is invertible, deduce that the rows of $Q^{-1}$ are right
eigenvectors of $M$.

Exercise 8. 

a. Apply the eigenvalue method from Corollary (4.6) to solve the system 
from Exercise 6 of §1. Compare your results. 

b. Apply the eigenvalue method from Corollary (4.6) to solve the system 
from Exercise 7 from §1. Compare your results. 

Exercise 9. Let $V_i$ be the subspace of $A$ spanned by the non-negative
powers of $[x_i]$, and consider the restriction of the multiplication operator
$m_{x_i} : A \to A$ to $V_i$. Assume $\{1, [x_i], \ldots, [x_i]^{m_i - 1}\}$ is a basis for $V_i$.

a. What is the matrix of the restriction with respect to this basis?
Show that it can be computed by the same calculations used in
Exercise 4 of §2 to find the monic generator of $I \cap \mathbb{C}[x_i]$, without computing
a lex Gröbner basis. Hint: See also Exercise 11 of §1 of Chapter 3.

b. What is the characteristic polynomial of $m_{x_i}|_{V_i}$ and what are its roots?

Exercise 10. Use part b of Exercise 9 and Corollary (4.6) to give another 
determination of the roots of the system from Exercise 4. 

Exercise 11. Let $I$ be a zero-dimensional ideal in $\mathbb{C}[x_1, \ldots, x_n]$, and
let $f \in \mathbb{C}[x_1, \ldots, x_n]$. Show that $[f]$ has a multiplicative inverse in
$\mathbb{C}[x_1, \ldots, x_n]/I$ if and only if $f(p) \ne 0$ for all $p \in \mathbf{V}(I)$. Hint: See the
proof of Theorem (4.5).

Exercise 12. Prove that a zero-dimensional ideal is radical if and only if
the matrices $m_{x_i}$ are diagonalizable for each $i$. Hint: Linear algebra tells
us that a matrix is diagonalizable if and only if its minimal polynomial is
square-free. Proposition (2.8) and Corollary (4.6) of this chapter will be
useful.

Exercise 13. Let $A = \mathbb{C}[x_1, \ldots, x_n]/I$ for a zero-dimensional ideal $I$,
and let $f \in \mathbb{C}[x_1, \ldots, x_n]$. If $p \in \mathbf{V}(I)$, we can find $g \in \mathbb{C}[x_1, \ldots, x_n]$
with $g(p) = 1$, and $g(p') = 0$ for all $p' \in \mathbf{V}(I)$, $p' \ne p$ (see Lemma (2.9)).
Prove that there is an $\ell \ge 1$ such that the coset $[g^\ell] \in A$ is a generalized
eigenvector for $m_f$ with eigenvalue $f(p)$. (A generalized eigenvector of a
matrix $M$ is a nonzero vector $v$ such that $(M - \lambda I)^m v = 0$ for some $m \ge 1$.)
Hint: Apply the Nullstellensatz to $(f - f(p))g$. In Chapter 4, we will study
the generalized eigenvectors of $m_f$ in more detail.

Exercise 14. Verify carefully the formula
$f(p)\,(p^{\alpha(1)}, \ldots, p^{\alpha(m)}) = (p^{\alpha(1)}, \ldots, p^{\alpha(m)})\, m_f$
used in the proof of Proposition (4.7).

Exercise 15. Let $>$ be some monomial order, and assume $x_1 > \cdots > x_n$.
If $g \in k[x_1, \ldots, x_n]$ satisfies $\mathrm{LT}(g) = x_j$, then prove that

$g = x_j + $ terms involving $x_i$ for $i > j$.

Exercise 16. (The Shape Lemma) Let $I$ be a zero-dimensional radical
ideal such that the $x_n$-coordinates of the points in $\mathbf{V}(I)$ are distinct. Let
$G$ be a reduced Gröbner basis for $I$ relative to a lex monomial order with
$x_n$ as the last variable.

a. If $\mathbf{V}(I)$ has $m$ points, prove that the cosets $1, [x_n], \ldots, [x_n^{m-1}]$ are
linearly independent and hence are a basis of $A = k[x_1, \ldots, x_n]/I$.

b. Prove that $G$ consists of $n$ polynomials

$g_1 = x_1 + h_1(x_n)$
$\vdots$
$g_{n-1} = x_{n-1} + h_{n-1}(x_n)$
$g_n = x_n^m + h_n(x_n)$,

where $h_1, \ldots, h_n$ are polynomials in $x_n$ of degree at most $m - 1$. Hint:
Start by expressing $[x_1], \ldots, [x_{n-1}], [x_n^m]$ in terms of the basis of part a.

c. Explain how you can find all points of $\mathbf{V}(I)$ once you know their
$x_n$-coordinates. Hint: Adapt the discussion following (4.9).

Exercise 17. This exercise will study the left eigenvectors of the matrix
$m_f$ and their relation to interpolation. Assume that $I$ is a zero-dimensional
radical ideal and that the values $f(p)$ are distinct for $p \in \mathbf{V}(I)$. We write
the monomial basis $B$ as $\{[x^{\alpha(1)}], \ldots, [x^{\alpha(m)}]\}$.

a. If $p \in \mathbf{V}(I)$, Lemma (2.9) of this chapter gives us $g$ such that $g(p) = 1$
and $g(p') = 0$ for all $p' \ne p$ in $\mathbf{V}(I)$. Prove that the coset $[g] \in A$
is a left eigenvector of $m_f$ and that the corresponding eigenspace has
dimension 1. Conclude that all eigenspaces of $m_f$ are of this form.

b. If $v = (v_1, \ldots, v_m)^T$ is a left eigenvector of $m_f$ corresponding to the
eigenvalue $f(p)$ for $p$ as in part a, then prove that the polynomial

$\tilde{g} = v_1 x^{\alpha(1)} + \cdots + v_m x^{\alpha(m)}$

satisfies $\tilde{g}(p) \ne 0$ and $\tilde{g}(p') = 0$ for $p' \ne p$ in $\mathbf{V}(I)$.

c. Show that we can take the polynomial $g$ of part a to be

$g = \frac{1}{v_1 p^{\alpha(1)} + \cdots + v_m p^{\alpha(m)}}\, \tilde{g}$.

Thus, once we know the solution $p$ and the corresponding left eigenvector
of $m_f$, we get an explicit formula for the polynomial $g$.

d. Given $\mathbf{V}(I) = \{p_1, \ldots, p_m\}$ and the corresponding left eigenvectors of
$m_f$, we get polynomials $g_1, \ldots, g_m$ such that $g_i(p_j) = 1$ if $i = j$ and 0
otherwise. Each $g_i$ is given explicitly by the formula in part c. The
interpolation problem asks to find a polynomial $h$ which takes preassigned
values $\lambda_1, \ldots, \lambda_m$ at the points $p_1, \ldots, p_m$. This means $h(p_i) = \lambda_i$ for
all $i$. Prove that one choice for $h$ is given by

$h = \lambda_1 g_1 + \cdots + \lambda_m g_m$.



Exercise 18. Develop and code an algorithm for computing the matrix of
$m_f$ on $A = k[x_1, \ldots, x_n]/I$, given the polynomial $f$, a list of polynomials
generating $I$, a list of variables, and a monomial order. Implement this
algorithm in a computer algebra system, and call it getmatrix. (In one of
the exercises in §5, we will use a Maple version of getmatrix.)



§5 Real Root Location and Isolation 

The eigenvalue techniques for solving equations from §4 are only a first way 
that we can use the results of §2 for finding roots of systems of polynomial 
equations. In this section we will discuss a second application that is more 
sophisticated. We follow a recent paper of Pedersen, Roy, and Szpirglas 
[PRS] and consider the problem of determining the real roots of a system 
of polynomial equations with coefficients in a field $k \subset \mathbb{R}$ (usually $k = \mathbb{Q}$
or a finite extension field of $\mathbb{Q}$). The underlying principle here is that
for many purposes, explicitly determined, bounded regions $R \subset \mathbb{R}^n$, each
guaranteed to contain exactly one solution of the system, can be just as
useful as a collection of numerical approximations. Note also that if we
wanted numerical approximations, once we had such an $R$, the job of finding
that one root would generally be much simpler than a search for all of the
roots! (Think of the choice of the initial approximation for an iterative
method such as Newton-Raphson.) For one-variable equations, this is also
the key idea of the interval arithmetic approach to computation with real
algebraic numbers (see [Mis]). We note that there are also other methods 
known for locating and isolating the real roots of a polynomial system (see 
§8.8 of [BW] for a different type of algorithm). 

To define our regions $R$ in $\mathbb{R}^n$, we will use polynomial functions in the
following way. Let $h \in k[x_1, \ldots, x_n]$ be a nonzero polynomial. The real
points where $h$ takes the value 0 form the variety $\mathbf{V}(h) \cap \mathbb{R}^n$. We will denote
this by $\mathbf{V}_{\mathbb{R}}(h)$ in the discussion that follows. In typical cases, $\mathbf{V}_{\mathbb{R}}(h)$ will
be a hypersurface, that is, an $(n - 1)$-dimensional variety in $\mathbb{R}^n$. The complement
of $\mathbf{V}_{\mathbb{R}}(h)$ in $\mathbb{R}^n$ is the union of connected open subsets on which $h$ takes
either all positive values or all negative values. We obtain in this way a
decomposition of $\mathbb{R}^n$ as a disjoint union

(5.1)   $\mathbb{R}^n = H^+ \cup H^- \cup \mathbf{V}_{\mathbb{R}}(h)$,

where $H^+ = \{a \in \mathbb{R}^n : h(a) > 0\}$, and similarly for $H^-$. Here are some
concrete examples.

Exercise 1. 

a. Let $h = (x^2 + y^2 - 1)(x^2 + y^2 - 2)$ in $\mathbb{R}[x, y]$. Identify the regions $H^+$ and
$H^-$ for this polynomial. How many connected components does each of
them have?

b. In this part of the exercise, we will see how regions like rectangular
“boxes” in $\mathbb{R}^n$ may be obtained by intersecting several regions $H^+$ or
$H^-$. For instance, consider the box

$R = \{(x, y) \in \mathbb{R}^2 : a < x < b,\ c < y < d\}$.

If $h_1(x, y) = (x - a)(x - b)$ and $h_2(x, y) = (y - c)(y - d)$, show that

$R = H_1^- \cap H_2^- = \{(x, y) \in \mathbb{R}^2 : h_i(x, y) < 0,\ i = 1, 2\}$.

What do $H_1^+$, $H_2^+$ and $H_1^+ \cap H_2^+$ look like in this example?

Given a region R like the box from part b of the above exercise, and 
a system of equations, we can ask whether there are roots of the system 
in R. The results of [PRS] give a way to answer questions like this, using 
an extension of the results of §2 and §4. Let $I$ be a zero-dimensional ideal
and let $B$ be the monomial basis of $A = k[x_1, \ldots, x_n]/I$ for any monomial
order. Recall that the trace of a square matrix is just the sum of its diagonal
entries. This gives a mapping Tr from $d \times d$ matrices to $k$. Using the trace,
we define a symmetric bilinear form $S$ by the rule:

$S(f, g) = \mathrm{Tr}(m_f \cdot m_g) = \mathrm{Tr}(m_{fg})$

(the last equality follows from part b of Proposition (4.2)).

Exercise 2. 

a. Prove that $S$ defined as above is a symmetric bilinear form on $A$, as
claimed. That is, show that $S$ is symmetric, meaning $S(f, g) = S(g, f)$
for all $f, g \in A$, and linear in the first variable, meaning

$S(cf_1 + f_2, g) = cS(f_1, g) + S(f_2, g)$

for all $f_1, f_2, g \in A$ and all $c \in k$. It follows that $S$ is linear in the
second variable as well.

b. Given a symmetric bilinear form $S$ on a vector space $V$ with basis
$\{v_1, \ldots, v_d\}$, the matrix of $S$ is the $d \times d$ matrix $M = (S(v_i, v_j))$. Show
that the matrix of $S$ with respect to the monomial basis $B = \{x^{\alpha(i)}\}$
for $A$ is given by:

$M = (\mathrm{Tr}(m_{x^{\alpha(i)} x^{\alpha(j)}})) = (\mathrm{Tr}(m_{x^{\alpha(i) + \alpha(j)}}))$.

Similarly, given the polynomial $h \in k[x_1, \ldots, x_n]$ used in the
decomposition (5.1), we can construct a bilinear form

$S_h(f, g) = \mathrm{Tr}(m_{hf} \cdot m_g) = \mathrm{Tr}(m_{hfg})$.

Let $M_h$ be the matrix of $S_h$ with respect to $B$.

Exercise 3. Show that $S_h$ is also a symmetric bilinear form on $A$. What
is the $i, j$ entry of $M_h$?

Since we assume $k \subset \mathbb{R}$, the matrices $M$ and $M_h$ are symmetric matrices
with real entries. It follows from the real spectral theorem (or principal axis
theorem) of linear algebra that all of the eigenvalues of $M$ and $M_h$ will be
real. For our purposes the exact values of these eigenvalues are much less
important than their signs.

Under a change of basis defined by an invertible matrix $Q$, the matrix
$M$ of a symmetric bilinear form $S$ is taken to $Q^T M Q$. There are two
fundamental invariants of $S$ under such changes of basis: the signature $\sigma(S)$,
which equals the difference between the number of positive eigenvalues and
the number of negative eigenvalues of $M$, and the rank $\rho(S)$, which equals
the rank of the matrix $M$. (See, for instance, Chapter 6 of [Her] for more
information on the signature and rank of bilinear forms.)

We are now ready to state the main result of this section. 

(5.2) Theorem. Let $I$ be a zero-dimensional ideal generated by
polynomials in $k[x_1, \ldots, x_n]$ ($k \subset \mathbb{R}$), so that $\mathbf{V}(I) \subset \mathbb{C}^n$ is finite. Then, for
$h \in k[x_1, \ldots, x_n]$, the signature and rank of the bilinear form $S_h$ satisfy:

$\sigma(S_h) = \#\{a \in \mathbf{V}(I) \cap \mathbb{R}^n : h(a) > 0\} - \#\{a \in \mathbf{V}(I) \cap \mathbb{R}^n : h(a) < 0\}$
$\rho(S_h) = \#\{a \in \mathbf{V}(I) : h(a) \ne 0\}$.

Proof. This result is essentially a direct consequence of the reasoning
leading up to Theorem (4.5) of this chapter. However, to give a full proof
it is necessary to take into account the multiplicities of the points in $\mathbf{V}(I)$
as solutions of the corresponding system of equations. Since we will
not give the precise definition of the multiplicity of a point in $\mathbf{V}(I)$ until
Chapter 4, we will only sketch the main ideas here.

By Theorem (4.5), for any $f$, we know that the set of eigenvalues of $m_f$
coincides with the set of values of $f$ at the points in $\mathbf{V}(I)$. The key new
fact we will need is that using the structure of the algebra $A$, for each point
$p$ in $\mathbf{V}(I)$ it is possible to define a positive integer $m(p)$ (the multiplicity)
so that $\sum_p m(p) = d = \dim(A)$, and so that $(t - f(p))^{m(p)}$ is a factor of
the characteristic polynomial of $m_f$. (See §2 of Chapter 4 for the details.)
By definition, the $i, j$ entry of the matrix $M_h$ is equal to

$\mathrm{Tr}(m_{h \cdot x^{\alpha(i)} \cdot x^{\alpha(j)}})$.

The trace of the multiplication operator equals the sum of its eigenvalues.
By the previous paragraph, the sum of these eigenvalues is

(5.3)   $\sum_{p \in \mathbf{V}(I)} m(p)\, h(p)\, p^{\alpha(i)} p^{\alpha(j)}$,

where $p^{\alpha(i)}$ denotes the value of the monomial $x^{\alpha(i)}$ at the point $p$. List
the points in $\mathbf{V}(I)$ as $p_1, \ldots, p_d$, where each point $p$ in $\mathbf{V}(I)$ is repeated
$m(p)$ times consecutively. Let $U$ be the $d \times d$ matrix whose $j$th column
consists of the values $p_j^{\alpha(i)}$ for $i = 1, \ldots, d$. From (5.3), we obtain a matrix
factorization $M_h = UDU^T$, where $D$ is the diagonal matrix with entries
$h(p_1), \ldots, h(p_d)$. The equation for the rank follows since $U$ is invertible.
Both $U$ and $D$ may have nonreal entries. However, the equation for the
signature follows from this factorization as well, using the facts that $M_h$ has
real entries and that the nonreal points in $\mathbf{V}(I)$ occur in complex conjugate
pairs. We refer the reader to Theorem 2.1 of [PRS] for the details. □

The theorem may be used to determine how the real points in $\mathbf{V}(I)$ are
distributed among the sets $H^+$, $H^-$ and $\mathbf{V}_{\mathbb{R}}(h)$ determined by $h$ in (5.1).
Theorem (5.2) implies that we can count the number of real points of
$\mathbf{V}(I)$ in $H^+$ and in $H^-$ as follows. The signature of $S_h$ gives the difference
between the number of solutions in $H^+$ and the number in $H^-$. By the same
reasoning, computing the signature of $S_{h^2}$ we get the number of solutions
in $H^+ \cup H^-$, since $h^2 > 0$ at every point of $H^+ \cup H^-$. From this we can
recover $\#\mathbf{V}(I) \cap H^+$ and $\#\mathbf{V}(I) \cap H^-$ by simple arithmetic. Finally, we
need to find $\#\mathbf{V}(I) \cap \mathbf{V}_{\mathbb{R}}(h)$, which is done in the following exercise.

Exercise 4. Using the form $S_1$ in addition to $S_h$ and $S_{h^2}$, show that
the three signatures $\sigma(S_1)$, $\sigma(S_h)$, $\sigma(S_{h^2})$ give all the information needed to
determine $\#\mathbf{V}(I) \cap H^+$, $\#\mathbf{V}(I) \cap H^-$ and $\#\mathbf{V}(I) \cap \mathbf{V}_{\mathbb{R}}(h)$.
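
A one-variable case shows the three signatures at work. The following is a minimal sketch under our own assumptions, not code from the text: the ideal is $I = \langle f \rangle \subset \mathbb{Q}[x]$ with $f = (x-1)(x-2)(x-3)(x+3)(x^2+1)$, the basis is $1, x, \ldots, x^5$, and $h = x - 1$. The helper names are hypothetical. Signs of eigenvalues are read off from the characteristic polynomial by sign changes (the coefficients are computed with the Faddeev-LeVerrier recursion, our choice of method), which is legitimate because the eigenvalues of a real symmetric matrix are all real.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def companion(a):
    """Matrix of m_x on Q[x]/<f>, f = x^d + a[d-1] x^(d-1) + ... + a[0],
    with respect to the basis 1, x, ..., x^(d-1)."""
    d = len(a)
    C = [[Fraction(0)] * d for _ in range(d)]
    for j in range(d - 1):
        C[j + 1][j] = Fraction(1)        # x * x^j = x^(j+1)
    for i in range(d):
        C[i][d - 1] = Fraction(-a[i])    # x * x^(d-1) reduced modulo f
    return C

def poly_at_matrix(h, C):
    """h(C) by Horner's rule; h lists coefficients from the constant term up."""
    n = len(C)
    R = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(h):
        R = matmul(R, C)
        for i in range(n):
            R[i][i] += c
    return R

def trace_form(h, C):
    """Matrix of S_h: entry (i, j) is Tr(m_{h x^i x^j}) = Tr(h(C) C^(i+j))."""
    d, P, t = len(C), poly_at_matrix(h, C), []
    for _ in range(2 * d - 1):
        t.append(trace(P))
        P = matmul(P, C)
    return [[t[i + j] for j in range(d)] for i in range(d)]

def charpoly(A):
    """Coefficients of det(tI - A), leading coefficient first."""
    n = len(A)
    coeffs, c = [Fraction(1)], Fraction(1)
    N = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        N = matmul(A, N)
        for i in range(n):
            N[i][i] += c
        c = -trace(matmul(A, N)) / k
        coeffs.append(c)
    return coeffs

def sign_changes(coeffs):
    signs = [c > 0 for c in coeffs if c != 0]   # zero coefficients are ignored
    return sum(s != t for s, t in zip(signs, signs[1:]))

def signature(A):
    p, n = charpoly(A), len(A)
    positive = sign_changes(p)
    negative = sign_changes([c * (-1) ** (n - k) for k, c in enumerate(p)])  # p(-t)
    return positive - negative

# f = x^6 - 3x^5 - 6x^4 + 24x^3 - 25x^2 + 27x - 18, constant term first, monic.
C = companion([-18, 27, -25, 24, -6, -3])
s1 = signature(trace_form([1], C))           # h = 1
sh = signature(trace_form([-1, 1], C))       # h = x - 1
sh2 = signature(trace_form([1, -2, 1], C))   # h^2 = x^2 - 2x + 1
n_plus, n_minus, n_zero = (sh2 + sh) // 2, (sh2 - sh) // 2, s1 - sh2
print(s1, sh, sh2, n_plus, n_minus, n_zero)
```

For this $f$ the real roots are $1, 2, 3, -3$, so the counts should come out as two roots with $h > 0$, one with $h < 0$, and one with $h = 0$; the nonreal pair $\pm i$ contributes nothing to any signature.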

From the discussion above, it might appear that we need to compute
the eigenvalues of the forms $S_h$ to count the numbers of solutions of the
equations in $H^+$ and $H^-$, but the situation is actually much better than
that. Namely, the entire calculation can be done symbolically, so no recourse
to numerical methods is needed. The reason is the following consequence
of the classical Descartes Rule of Signs.

(5.4) Proposition. Let $M_h$ be the matrix of $S_h$, and let

$p_h(t) = \det(M_h - tI)$

be its characteristic polynomial. Then the number of positive eigenvalues of
$S_h$ is equal to the number of sign changes in the sequence of coefficients of
$p_h(t)$. (In counting sign changes, any zero coefficients are ignored.)

Proof. See Proposition 2.8 of [PRS], or Exercise 5 below for a proof. □



For instance, consider the real symmetric matrix

$M = \begin{pmatrix} 3 & 1 & 5 & 4 \\ 1 & 2 & 6 & 9 \\ 5 & 6 & 7 & -1 \\ 4 & 9 & -1 & 0 \end{pmatrix}$.

The characteristic polynomial of $M$ is $t^4 - 12t^3 - 119t^2 + 1098t - 1251$,
giving three sign changes in the sequence of coefficients. Thus $M$ has three
positive eigenvalues, as one can check.
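
The check can be carried out exactly. The sketch below is our own and rests on one stated choice: the coefficients of $\det(tI - M)$ are computed with the Faddeev-LeVerrier recursion, a standard method that the text does not use. Since $M$ is $4 \times 4$, $\det(tI - M)$ and $\det(M - tI)$ coincide.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def charpoly(A):
    """Coefficients of det(tI - A), leading coefficient first."""
    n = len(A)
    coeffs, c = [Fraction(1)], Fraction(1)
    N = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        N = matmul(A, N)
        for i in range(n):
            N[i][i] += c
        c = -sum(matmul(A, N)[i][i] for i in range(n)) / k
        coeffs.append(c)
    return coeffs

def sign_changes(coeffs):
    signs = [c > 0 for c in coeffs if c != 0]   # zero coefficients are ignored
    return sum(s != t for s, t in zip(signs, signs[1:]))

M = [[3, 1, 5, 4],
     [1, 2, 6, 9],
     [5, 6, 7, -1],
     [4, 9, -1, 0]]

p = charpoly(M)
n = len(M)
positive = sign_changes(p)
# Replacing t by -t flips the sign of the odd-degree coefficients.
negative = sign_changes([c * (-1) ** (n - k) for k, c in enumerate(p)])
print([int(c) for c in p], positive, negative)
```

The same two helper functions apply unchanged to any rational symmetric matrix, such as the matrices $M_h$ of this section.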



Exercise 5. The usual version of Descartes’ Rule of Signs asserts that the
number of positive roots of a polynomial $p(t)$ in $\mathbb{R}[t]$ equals the number of
sign changes in its coefficient sequence minus a non-negative even integer.

a. Using this, show that the number of negative roots equals the number
of sign changes in the coefficient sequence of $p(-t)$ minus another
non-negative even integer.

b. Deduce (5.4) from Descartes’ Rule of Signs, part a, and the fact that all
eigenvalues of $M_h$ are real.

Using these ideas to find and isolate roots requires a good searching 
strategy. We will not consider such questions here. For an example showing 
how to certify the presence of exactly one root of a system in a given region, 
see Exercise 6 below. 

The problem of counting real solutions of polynomial systems in regions
$R \subset \mathbb{R}^n$ defined by several polynomial inequalities and/or equalities has
been considered in general by Ben-Or, Kozen, and Reif (see, for instance, 
[BKR]). Using the signature calculations as above gives an approach which 
is very well suited to parallel computation, and whose complexity is rela- 
tively manageable. We refer the interested reader to [PRS] once again for 
a discussion of these issues. 



Additional Exercises for §5 

Exercise 6. In this exercise, you will verify that the equations

$0 = x^2 - 2xz + 5$
$0 = xy^2 + yz + 1$
$0 = 3y^2 - 8xz$

have exactly one real solution in the rectangular box

$R = \{(x, y, z) \in \mathbb{R}^3 : 0 < x < 1,\ -3 < y < -2,\ 3 < z < 4\}$.

a. Using grevlex monomial order with $x > y > z$, compute a Gröbner
basis $G$ for the ideal $I$ generated by the above equations. Also find the
corresponding monomial basis $B$ for $\mathbb{C}[x, y, z]/I$.

b. Implement the following Maple procedure getform which computes the
matrix of the symmetric bilinear form $S_h$.

getform := proc(h,B,G,VList,torder)
# computes matrix of the symmetric bilinear form S_h, with
# respect to the monomial basis B for the quotient algebra
# k[VList]/<G>. G is a Gr\"obner basis with respect to torder
local d,M,i,j,p,q;
with(linalg):
d := nops(B);
M := array(symmetric,1..d,1..d);
for i to d do
  for j from i to d do
    p := normalf(h*B[i]*B[j],G,VList,torder);
    M[i,j] := trace(getmatrix(p,B,G,VList,torder));
  od;
od;
RETURN(eval(M))
end:

The call to getmatrix computes the matrix $m_{h \cdot x^{\alpha(i)} \cdot x^{\alpha(j)}}$ with respect to
the monomial basis $B = \{x^{\alpha(i)}\}$ for $A$. Coding getmatrix was Exercise
18 in §4 of this chapter.

c. Then, using

h := x*(x-1);
S := getform(h,B,G,[x,y,z],tdeg);

compute the matrix of the bilinear form $S_h$ for $h = x(x - 1)$.

d. The actual entries of this $8 \times 8$ rational matrix are rather complicated
and not very informative; we will omit reproducing them. Instead, use

charpoly(eval(S),t);

to compute the characteristic polynomial of the matrix. Your result
should be a polynomial of the form:

$t^8 - a_1t^7 + a_2t^6 + a_3t^5 - a_4t^4 - a_5t^3 - a_6t^2 + a_7t + a_8$,

where each $a_i$ is a positive rational number.
e. Use Proposition (5.4) to show that $S_h$ has 4 positive eigenvalues. Since
$a_8 \ne 0$, $t = 0$ is not an eigenvalue. Explain why the other 4 eigenvalues
are strictly negative, and conclude that $S_h$ has signature

$\sigma(S_h) = 4 - 4 = 0$.

f. Use the second equation in Theorem (5.2) to show that $h$ is nonvanishing
on the real or complex points of $\mathbf{V}(I)$. Hint: Show that $S_h$ has rank 8.

g. Repeat the computation for $h^2$:

T := getform(h*h,B,G,[x,y,z],tdeg);

and show that in this case, we get a second symmetric matrix with
exactly 5 positive and 3 negative eigenvalues. Conclude that the signature
of $S_{h^2}$ (which counts the total number of real solutions in this case) is

$\sigma(S_{h^2}) = 5 - 3 = 2$.

h. Using Theorem (5.2) and combining these two calculations, show that

$\#\mathbf{V}(I) \cap H^+ = \#\mathbf{V}(I) \cap H^- = 1$,

and conclude that there is exactly one real root between the two planes
$x = 0$ and $x = 1$ in $\mathbb{R}^3$. Our desired region $R$ is contained in this infinite
slab in $\mathbb{R}^3$. What can you say about the other real solution?

i. Complete the exercise by applying Theorem (5.2) to polynomials in $y$
and $z$ chosen according to the definition of $R$.

Exercise 7. Use the techniques of this section to determine the number
of real solutions of

$0 = x^2 + 2y^2 - y - 2z$
$0 = x^2 - 8y^2 + 10z - 1$
$0 = x^2 - 7yz$

in the box $R = \{(x, y, z) \in \mathbb{R}^3 : 0 < x < 1,\ 0 < y < 1,\ 0 < z < 1\}$. (This
is the same system as in Exercise 6 of §1. Check your results using your
previous work.)

Exercise 8. The alternative real root isolation methods discussed in §8.8
of [BW] are based on a result for real one-variable polynomials known as
Sturm’s Theorem. Suppose $p(t) \in \mathbb{Q}[t]$ is a polynomial with no multiple
roots in $\mathbb{C}$. Then $\mathrm{GCD}(p(t), p'(t)) = 1$, and the sequence of polynomials
produced by

$p_0(t) = p(t)$
$p_1(t) = p'(t)$
$p_i(t) = -\mathrm{rem}(p_{i-2}(t), p_{i-1}(t), t), \quad i \ge 2$

(so $p_i(t)$ is the negative of the remainder on division of $p_{i-2}(t)$ by $p_{i-1}(t)$ in
$\mathbb{Q}[t]$) will eventually reach a nonzero constant, and all subsequent terms will
be zero. Let $p_m(t)$ be the last nonzero term in the sequence. This sequence
of polynomials is called the Sturm sequence associated to $p(t)$.

a. (Sturm’s Theorem) If $a < b$ in $\mathbb{R}$, and neither is a root of $p(t) = 0$, then
show that the number of real roots of $p(t) = 0$ in the interval $[a, b]$ is
the difference between the number of sign changes in the sequence of
real numbers $p_0(a), p_1(a), \ldots, p_m(a)$ and the number of sign changes in
the sequence $p_0(b), p_1(b), \ldots, p_m(b)$. (Sign changes are counted in the
same way as for Descartes’ Rule of Signs.)

b. Give an algorithm based on part a that takes as input a polynomial
$p(t) \in \mathbb{Q}[t]$ with no multiple roots in $\mathbb{C}$, and produces as output a
collection of intervals $[a_i, b_i]$ in $\mathbb{R}$, each of which contains exactly one
root of $p$. Hint: Start with an interval guaranteed to contain all the
real roots of $p(t) = 0$ (see Exercise 3 of §1, for instance) and bisect
repeatedly, using Sturm’s Theorem on each subinterval.
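
To make the definition of the Sturm sequence concrete, here is a small sketch that builds the sequence and applies the count from part a to one polynomial. The polynomial $p(t) = t^3 - 3t + 1$ and all function names are our own illustrative choices. The bisection strategy asked for in part b is deliberately left out.

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a on division by b (coefficients from the leading term down)."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)                      # the leading coefficient is now zero
    while a and a[0] == 0:            # drop any remaining leading zeros
        a.pop(0)
    return a

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def sturm_sequence(p):
    """p_0 = p, p_1 = p', p_i = -rem(p_{i-2}, p_{i-1}), until the remainder is zero."""
    seq = [list(p), derivative(p)]
    while True:
        r = poly_rem(seq[-2], seq[-1])
        if not r:
            return seq
        seq.append([-c for c in r])

def evaluate(p, x):
    value = Fraction(0)
    for c in p:                       # Horner's rule
        value = value * x + c
    return value

def sign_changes(values):
    signs = [v > 0 for v in values if v != 0]   # zeros are ignored
    return sum(s != t for s, t in zip(signs, signs[1:]))

def count_roots(seq, a, b):
    """Number of real roots in [a, b], assuming p(a) != 0 and p(b) != 0."""
    return (sign_changes([evaluate(q, a) for q in seq])
            - sign_changes([evaluate(q, b) for q in seq]))

seq = sturm_sequence([1, 0, -3, 1])   # p(t) = t^3 - 3t + 1
print(seq)
print(count_roots(seq, -2, 2), count_roots(seq, 0, 1), count_roots(seq, 1, 2))
```

For this $p$ the sequence is $t^3 - 3t + 1,\ 3t^2 - 3,\ 2t - 1,\ 9/4$, and the counts locate all three real roots in $[-2, 2]$, with one in each of $[0, 1]$ and $[1, 2]$.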




Chapter 3 

Resultants 



In Chapter 2, we saw how Gröbner bases can be used in Elimination Theory.
An alternate approach to the problem of elimination is given by resultants. 
The resultant of two polynomials is well known and is implemented in many 
computer algebra systems. In this chapter, we will review the properties 
of the resultant and explore its generalization to several polynomials in 
several variables. This multipolynomial resultant can be used to eliminate 
variables from three or more equations and, as we will see at the end of the 
chapter, it is a surprisingly powerful tool for finding solutions of equations. 



§1 The Resultant of Two Polynomials 



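Given two polynomials $f, g \in k[x]$ of positive degree, say

(1.1)   $f = a_0x^l + \cdots + a_l, \quad a_0 \ne 0, \quad l > 0$
        $g = b_0x^m + \cdots + b_m, \quad b_0 \ne 0, \quad m > 0$.

Then the resultant of $f$ and $g$, denoted $\mathrm{Res}(f, g)$, is the $(l + m) \times (l + m)$
determinant

(1.2)   $\mathrm{Res}(f, g) = \det\begin{pmatrix}
a_0    &        &        & b_0    &        &        \\
a_1    & a_0    &        & b_1    & b_0    &        \\
\vdots & a_1    & \ddots & \vdots & b_1    & \ddots \\
a_l    & \vdots & a_0    & b_m    & \vdots & b_0    \\
       & a_l    & a_1    &        & b_m    & b_1    \\
       &        & \vdots &        &        & \vdots \\
       &        & a_l    &        &        & b_m
\end{pmatrix}$

where the coefficients of $f$ occupy the first $m$ columns, those of $g$ occupy
the last $l$ columns, and the blank spaces are filled with zeros. When we want to emphasize
the dependence on $x$, we will write $\mathrm{Res}(f, g, x)$ instead of $\mathrm{Res}(f, g)$. As a
simple example, we have

(1.3)   $\mathrm{Res}(x^3 + x - 1,\ 2x^2 + 3x + 7) = \det\begin{pmatrix}
1 & 0 & 2 & 0 & 0 \\
0 & 1 & 3 & 2 & 0 \\
1 & 0 & 7 & 3 & 2 \\
-1 & 1 & 0 & 7 & 3 \\
0 & -1 & 0 & 0 & 7
\end{pmatrix} = 159$.

Exercise 1. Show that $\mathrm{Res}(f, g) = (-1)^{lm}\,\mathrm{Res}(g, f)$. Hint: What happens
when you interchange two columns of a determinant?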
Three basic properties of the resultant are: 

• (Integer Polynomial) $\mathrm{Res}(f, g)$ is an integer polynomial in the coefficients
of $f$ and $g$.

• (Common Factor) $\mathrm{Res}(f, g) = 0$ if and only if $f$ and $g$ have a common
factor in $k[x]$.

• (Elimination) There are polynomials $A, B \in k[x]$ such that $Af + Bg =
\mathrm{Res}(f, g)$. The coefficients of $A$ and $B$ are integer polynomials in the
coefficients of $f$ and $g$.

Proofs of these properties can be found in [CLO], Chapter 3, §5. The Integer
Polynomial property says that there is a polynomial

$\mathrm{Res}_{l,m} \in \mathbb{Z}[u_0, \ldots, u_l, v_0, \ldots, v_m]$

such that if $f, g$ are as in (1.1), then

$\mathrm{Res}(f, g) = \mathrm{Res}_{l,m}(a_0, \ldots, a_l, b_0, \ldots, b_m)$.

Over the complex numbers, the Common Factor property tells us that
$f, g \in \mathbb{C}[x]$ have a common root if and only if their resultant is zero. Thus
(1.3) shows that $x^3 + x - 1$ and $2x^2 + 3x + 7$ have no common roots in $\mathbb{C}$
since $159 \ne 0$, even though we don’t know the roots themselves.
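
The determinant in (1.3) and the Common Factor property are easy to check by machine. The sketch below is ours: it builds the matrix of (1.2) column by column and evaluates the determinant by exact Gaussian elimination. The second pair of polynomials, which share the root $x = 1$, is our own example.

```python
from fractions import Fraction

def sylvester(f, g):
    """Matrix of (1.2); f and g list coefficients from the leading term down."""
    l, m = len(f) - 1, len(g) - 1
    n = l + m
    S = [[Fraction(0)] * n for _ in range(n)]
    for j in range(m):                  # m shifted copies of the coefficients of f
        for i, a in enumerate(f):
            S[i + j][j] = Fraction(a)
    for j in range(l):                  # l shifted copies of the coefficients of g
        for i, b in enumerate(g):
            S[i + j][m + j] = Fraction(b)
    return S

def det(A):
    """Determinant by Gaussian elimination over the rationals."""
    A = [row[:] for row in A]
    n, d = len(A), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            d = -d                      # a row swap changes the sign
        d *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return d

def resultant(f, g):
    return det(sylvester(f, g))

print(resultant([1, 0, 1, -1], [2, 3, 7]))    # the polynomials of (1.3)
print(resultant([1, 1, -2], [1, -4, 3]))      # (x-1)(x+2) and (x-1)(x-3)
```

The first value reproduces (1.3), and the second is zero because the two quadratics have the common factor $x - 1$.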

To understand the Elimination property, we need to explain how resul- 
tants can be used to eliminate variables from systems of equations. As an 
example, consider the equations 

$f = xy - 1 = 0$
$g = x^2 + y^2 - 4 = 0$.

Here, we have two variables to work with, but if we regard $f$ and $g$ as
polynomials in $x$ whose coefficients are polynomials in $y$, we can compute
the resultant with respect to $x$ to obtain

$\mathrm{Res}(f, g, x) = \det\begin{pmatrix} y & 0 & 1 \\ -1 & y & 0 \\ 0 & -1 & y^2 - 4 \end{pmatrix} = y^4 - 4y^2 + 1$.

By the Elimination property, there are polynomials $A, B \in k[x, y]$ with
$A \cdot (xy - 1) + B \cdot (x^2 + y^2 - 4) = y^4 - 4y^2 + 1$. This means $\mathrm{Res}(f, g, x)$
is in the elimination ideal $\langle f, g \rangle \cap k[y]$ as defined in §1 of Chapter 2, and it
follows that $y^4 - 4y^2 + 1$ vanishes at any common solution of $f = g = 0$.
Hence, by solving $y^4 - 4y^2 + 1 = 0$, we can find the $y$-coordinates of the
solutions. Thus resultants relate nicely to what we did in Chapter 2.
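
For this particular pair the polynomials promised by the Elimination property can be written down by hand. The identity below is our own check rather than part of the text; expanding the left side confirms that one admissible choice is $A = -(xy + 1)$ and $B = y^2$.

```latex
\begin{align*}
y^2\,(x^2 + y^2 - 4) - (xy + 1)(xy - 1)
  &= x^2y^2 + y^4 - 4y^2 - (x^2y^2 - 1) \\
  &= y^4 - 4y^2 + 1 .
\end{align*}
```

In particular, at any point where $xy - 1$ and $x^2 + y^2 - 4$ both vanish, the left side is zero, so $y^4 - 4y^2 + 1$ must vanish there as well.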

Exercise 2. Use resultants to find all solutions of the above equations $f =
g = 0$. Also find the solutions using $\mathrm{Res}(f, g, y)$. In Maple, the command
for resultant is resultant.

More generally, if $f$ and $g$ are any polynomials in $k[x, y]$ in which $x$
appears to a positive power, then we can compute $\mathrm{Res}(f, g, x)$ in the same
way. Since the coefficients are polynomials in $y$, the Integer Polynomial
property guarantees that $\mathrm{Res}(f, g, x)$ is again a polynomial in $y$. Thus, we
can use the resultant to eliminate $x$, and as above, $\mathrm{Res}(f, g, x)$ is in the
elimination ideal $\langle f, g \rangle \cap k[y]$ by the Elimination Property. For a further
discussion of the connection between resultants and elimination theory, the
reader should consult Chapter 3 of [CLO] or Chapter XI of [vdW].

One interesting aspect of the resultant is that it can be expressed in 
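many different ways. For example, given $f, g \in k[x]$ as in (1.1), suppose
their roots are $\xi_1, \ldots, \xi_l$ and $\eta_1, \ldots, \eta_m$
respectively (note that these roots might lie in some bigger field). Then one
can show that the resultant is given by

\[
\begin{aligned}
\mathrm{Res}(f, g) &= a_0^m b_0^l \prod_{i=1}^{l} \prod_{j=1}^{m} (\xi_i - \eta_j) \\
  &= a_0^m \prod_{i=1}^{l} g(\xi_i) \\
  &= (-1)^{lm} b_0^l \prod_{j=1}^{m} f(\eta_j).
\end{aligned}
\tag{1.4}
\]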

A proof of this is given in the exercises at the end of the section. 
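
As a quick sanity check of (1.4), take the illustrative pair $f = (x - 1)(x - 2) = x^2 - 3x + 2$ and $g = x - 3$ (our own choice, not an example from the text), so that $l = 2$, $m = 1$, $a_0 = b_0 = 1$, $\xi_1 = 1$, $\xi_2 = 2$ and $\eta_1 = 3$. The three products then agree:

```latex
\begin{aligned}
a_0^m b_0^l \prod_{i,j} (\xi_i - \eta_j) &= (1 - 3)(2 - 3) = 2, \\
a_0^m \prod_{i} g(\xi_i) &= g(1)\, g(2) = (-2)(-1) = 2, \\
(-1)^{lm} b_0^l \prod_{j} f(\eta_j) &= (-1)^{2} f(3) = 9 - 9 + 2 = 2 .
\end{aligned}
```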

Exercise 3.

a. Show that the three products on the right hand side of (1.4) are all
equal. Hint: $g = b_0 (x - \eta_1) \cdots (x - \eta_m)$.

b. Use (1.4) to show that
$\mathrm{Res}(f_1 f_2, g) = \mathrm{Res}(f_1, g)\,\mathrm{Res}(f_2, g)$.

The formulas given in (1.4) may seem hard to use since they involve the
roots of $f$ or $g$. But in fact there is a relatively simple way to compute
the above products. For example, to understand the formula
$\mathrm{Res}(f, g) = a_0^m \prod_{i=1}^{l} g(\xi_i)$, we will use the
techniques of §2 of Chapter 2. Thus, consider the quotient ring
$A_f = k[x]/\langle f \rangle$, and let the multiplication map $m_g$ be
defined by

\[ m_g([h]) = [g] \cdot [h] = [gh] \in A_f, \]

where $[h] \in A_f$ is the coset of $h \in k[x]$. If we think in terms of
remainders on division by $f$, then we can regard $A_f$ as consisting of all
polynomials $h$ of degree $< l$, and under this interpretation, $m_g(h)$ is
the remainder of $gh$ on division by $f$. Then we can compute the resultant
$\mathrm{Res}(f, g)$ in terms of $m_g$ as follows.

(1.5) Proposition. $\mathrm{Res}(f, g) = a_0^m \det(m_g : A_f \to A_f)$.

Proof. Note that $A_f$ is a vector space over $k$ of dimension $l$ (this is
clear from the remainder interpretation of $A_f$). Further, as explained in
§2 of Chapter 2, $m_g : A_f \to A_f$ is a linear map. Recall from linear
algebra that the determinant $\det(m_g)$ is defined to be the determinant of
any matrix $M$ representing the linear map $m_g$. Since $M$ and $m_g$ have
the same eigenvalues, it follows that $\det(m_g)$ is the product of the
eigenvalues of $m_g$, counted with multiplicity.

In the special case when $g(\xi_1), \ldots, g(\xi_l)$ are distinct, we can
prove our result using the theory of Chapter 2. Namely, since
$\{\xi_1, \ldots, \xi_l\} = \mathbf{V}(f)$, it follows from Theorem (4.5) of
Chapter 2 that the numbers $g(\xi_1), \ldots, g(\xi_l)$ are the eigenvalues
of $m_g$. Since these are distinct and $A_f$ has dimension $l$, it follows
that the eigenvalues have multiplicity one, so that
$\det(m_g) = g(\xi_1) \cdots g(\xi_l)$, as desired. The general case will be
covered in the exercises at the end of the section. □

Exercise 4. For $f = x^3 + x - 1$ and $g = 2x^2 + 3x + 7$ as in (1.3), use
the basis $\{1, x, x^2\}$ of $A_f$ (thinking of $A_f$ in terms of
remainders) to show

\[
\mathrm{Res}(f, g) = 1^2 \det(m_g) = \det \begin{pmatrix} 7 & 2 & 3 \\ 3 & 5 & -1 \\ 2 & 3 & 5 \end{pmatrix} = 159.
\]



Note that the $3 \times 3$ determinant in this example is smaller than the
$5 \times 5$ determinant required by the definition (1.2). In general,
Proposition (1.5) tells us that $\mathrm{Res}(f, g)$ can be represented as an
$l \times l$ determinant, while the definition of resultant uses an
$(l + m) \times (l + m)$ matrix. The getmatrix procedure from Exercise 7 of
Chapter 2, §2 can be used to construct the smaller matrix. Also, by
interchanging $f$ and $g$, we can represent the resultant using an
$m \times m$ determinant.
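
The smaller matrix of Proposition (1.5) is easy to build directly from remainders. The sketch below is our own illustration (it is not the getmatrix procedure, and the names `rem`, `mult_matrix` and `det` are hypothetical helpers); it assumes the basis $1, x, \ldots, x^{l-1}$ of $A_f$ and stores the remainder of $g \cdot x^j$ in column $j$.

```python
from fractions import Fraction

def rem(h, f):
    """Remainder of h on division by f (coefficient lists, highest degree first)."""
    h = [Fraction(c) for c in h]
    while len(h) >= len(f):
        k = h[0] / f[0]
        for i, a in enumerate(f):
            h[i] -= k * a
        h.pop(0)                      # the leading coefficient is now zero
    return h

def mult_matrix(g, f):
    """Matrix of m_g on A_f in the basis 1, x, ..., x^(l-1)."""
    l = len(f) - 1
    cols = []
    for j in range(l):
        r = rem(list(g) + [0] * j, f)            # remainder of g * x^j
        r = [Fraction(0)] * (l - len(r)) + r     # pad to l coefficients
        cols.append(r[::-1])                     # constant term first
    return [[cols[j][i] for j in range(l)] for i in range(l)]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return Fraction(1)
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

f = [1, 0, 1, -1]      # x^3 + x - 1
g = [2, 3, 7]          # 2x^2 + 3x + 7
M = mult_matrix(g, f)
m = len(g) - 1
print(M)                       # rows (7, 2, 3), (3, 5, -1), (2, 3, 5)
print(f[0] ** m * det(M))      # a_0^m det(m_g) = 159
```

Only an $l \times l$ determinant is needed here, at the cost of $l$ polynomial divisions to fill in the columns.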

For the final topic of this section, we will discuss a variation on
$\mathrm{Res}(f, g)$ which will be important for §2. Namely, instead of using
polynomials in the single variable $x$, we could instead work with
homogeneous polynomials in variables $x, y$. Recall that a polynomial is
homogeneous if every term has the same total degree. Thus, if
$F, G \in k[x, y]$ are homogeneous polynomials of total degrees $l, m$
respectively, then we can write

\[
\begin{aligned}
F &= a_0 x^l + a_1 x^{l-1} y + \cdots + a_l y^l \\
G &= b_0 x^m + b_1 x^{m-1} y + \cdots + b_m y^m.
\end{aligned}
\tag{1.6}
\]

Note that $a_0$ or $b_0$ (or both) might be zero. Then we define
$\mathrm{Res}(F, G) \in k$ using the same determinant as in (1.2).

Exercise 5. Show that $\mathrm{Res}(x^l, y^m) = 1$.

If we homogenize the polynomials $f$ and $g$ of (1.1) using appropriate
powers of $y$, then we get $F$ and $G$ as in (1.6). In this case, it is
obvious that $\mathrm{Res}(f, g) = \mathrm{Res}(F, G)$. However, going the
other way is a bit more subtle, for if $F$ and $G$ are given by (1.6), then
we can dehomogenize by setting $y = 1$, but we might fail to get polynomials
of the proper degrees since $a_0$ or $b_0$ might be zero. Nevertheless, the
resultant $\mathrm{Res}(F, G)$ still satisfies the following basic
properties.

(1.7) Proposition. Fix positive integers $l$ and $m$.

a. There is a polynomial
$\mathrm{Res}_{l,m} \in \mathbb{Z}[a_0, \ldots, a_l, b_0, \ldots, b_m]$ such
that

\[ \mathrm{Res}(F, G) = \mathrm{Res}_{l,m}(a_0, \ldots, a_l, b_0, \ldots, b_m) \]

for all $F, G$ as in (1.6).

b. Over the field of complex numbers, $\mathrm{Res}(F, G) = 0$ if and only if
the equations $F = G = 0$ have a solution $(x, y) \ne (0, 0)$ in
$\mathbb{C}^2$ (this is called a nontrivial solution).

Proof. The first statement is an obvious consequence of the determinant
formula for the resultant. As for the second, first observe that if
$(u, v) \in \mathbb{C}^2$ is a nontrivial solution, then so is
$(\lambda u, \lambda v)$ for any nonzero complex number $\lambda$. We now
break up the proof into three cases.

First, if $a_0 = b_0 = 0$, then note that the resultant vanishes and that we
have the nontrivial solution $(x, y) = (1, 0)$. Next, suppose that
$a_0 \ne 0$ and $b_0 \ne 0$. If $\mathrm{Res}(F, G) = 0$, then, when we
dehomogenize by setting $y = 1$, we get polynomials $f, g \in \mathbb{C}[x]$
with $\mathrm{Res}(f, g) = 0$. Since we're working over the complex numbers,
the Common Factor property implies $f$ and $g$ must have a common root
$x = u$, and then $(x, y) = (u, 1)$ is the desired nontrivial solution. Going
the other way, if we have a nontrivial solution $(u, v)$, then our assumption
$a_0 b_0 \ne 0$ implies that $v \ne 0$. Then $(u/v, 1)$ is also a solution,
which means that $u/v$ is a common root of the dehomogenized polynomials.
From here, it follows easily that $\mathrm{Res}(F, G) = 0$.

The final case is when exactly one of $a_0$, $b_0$ is zero. The argument is a
bit more complicated and will be covered in the exercises at the end of the
section. □

We should also mention that many other properties of the resultant,
along with proofs, are contained in Chapter 12 of [GKZ].



Additional Exercises for §1 



Exercise 6. As an example of how resultants can be used to eliminate
variables from equations, consider the parametric equations

\[
\begin{aligned}
x &= 1 + s + t + st \\
y &= 2 + s + st + t^2 \\
z &= s + t + s^2.
\end{aligned}
\]

Our goal is to eliminate $s, t$ from these equations to find an equation
involving only $x, y, z$.

a. Use Gröbner basis methods to find the desired equation in $x, y, z$.

b. Use resultants to find the desired equations. Hint: Let
$f = 1 + s + t + st - x$, $g = 2 + s + st + t^2 - y$ and
$h = s + t + s^2 - z$. Then eliminate $t$ by computing
$\mathrm{Res}(f, g, t)$ and $\mathrm{Res}(f, h, t)$. Now what resultant do you
use to get rid of $s$?

c. How are the answers to parts a and b related?



Exercise 7. Let $f, g$ be as in (1.1). If we divide $g$ by $f$, we get
$g = qf + r$, where $\deg(r) < \deg(g) = m$. Then, assuming that $r$ is
nonconstant, show that

\[ \mathrm{Res}(f, g) = a_0^{m - \deg(r)}\, \mathrm{Res}(f, r). \]

Hint: Let $g_1 = g - (b_0/a_0) x^{m-l} f$ and use column operations to
subtract $b_0/a_0$ times the first $l$ columns in the $f$ part of the matrix
from the columns in the $g$ part. Expanding repeatedly along the first row
gives $\mathrm{Res}(f, g) = a_0^{m - \deg(g_1)}\, \mathrm{Res}(f, g_1)$.
Continue this process to obtain the desired formula.



Exercise 8. Our definition of $\mathrm{Res}(f, g)$ requires that $f, g$ have
positive degrees. Here is what to do when $f$ or $g$ is constant.

a. If $\deg(f) > 0$ but $g$ is a nonzero constant $b_0$, show that the
determinant (1.2) still makes sense and gives
$\mathrm{Res}(f, b_0) = b_0^l$.

b. If $\deg(g) > 0$ and $a_0 \ne 0$, what is $\mathrm{Res}(a_0, g)$? Also,
what is $\mathrm{Res}(a_0, b_0)$? What about $\mathrm{Res}(f, 0)$ or
$\mathrm{Res}(0, g)$?

c. Exercise 7 assumes that the remainder $r$ has positive degree. Show that
the formula of Exercise 7 remains true even if $r$ is constant.



Exercise 9. By Exercises 1, 7 and 8, resultants have the following three
properties: $\mathrm{Res}(f, g) = (-1)^{lm}\, \mathrm{Res}(g, f)$;
$\mathrm{Res}(f, b_0) = b_0^l$; and
$\mathrm{Res}(f, g) = a_0^{m - \deg(r)}\, \mathrm{Res}(f, r)$ when
$g = qf + r$. Use these properties to describe an algorithm for computing
resultants. Hint: Your answer should be similar to the Euclidean algorithm.

Exercise 10. This exercise will give a proof of (1.4).

a. Given $f, g$ as usual, define
$\mathrm{res}(f, g) = a_0^m \prod_{i=1}^{l} g(\xi_i)$, where
$\xi_1, \ldots, \xi_l$ are the roots of $f$. Then show that
$\mathrm{res}(f, g)$ has the three properties of resultants mentioned in
Exercise 9.

b. Show that the algorithm for computing $\mathrm{res}(f, g)$ is the same as
the algorithm for computing $\mathrm{Res}(f, g)$, and conclude that the two
are equal for all $f, g$.

Exercise 11. Let $f = a_0 x^l + a_1 x^{l-1} + \cdots + a_l \in k[x]$ be a
polynomial with $a_0 \ne 0$, and let $A_f = k[x]/\langle f \rangle$. Given
$g \in k[x]$, let $m_g : A_f \to A_f$ be multiplication by $g$.

a. Use the basis $\{1, x, \ldots, x^{l-1}\}$ of $A_f$ (so we are thinking of
$A_f$ as consisting of remainders) to show that the matrix of $m_x$ is

\[
C_f = \begin{pmatrix}
0 & 0 & \cdots & 0 & -a_l/a_0 \\
1 & 0 & \cdots & 0 & -a_{l-1}/a_0 \\
0 & 1 & \cdots & 0 & -a_{l-2}/a_0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 1 & -a_1/a_0
\end{pmatrix}.
\]

This matrix (or more commonly, its transpose) is called the companion
matrix of $f$.

b. If $g = b_0 x^m + \cdots + b_m$, then explain why the matrix of $m_g$ is
given by

\[ g(C_f) = b_0 C_f^m + b_1 C_f^{m-1} + \cdots + b_m I, \]

where $I$ is the $l \times l$ identity matrix. Hint: By Proposition (2.4) of
Chapter 2, the map sending $g \in k[x]$ to $m_g \in M_{l \times l}(k)$ is a
ring homomorphism.

c. Conclude that $\mathrm{Res}(f, g) = a_0^m \det(g(C_f))$.

Exercise 12. In Proposition (1.5), we interpreted $\mathrm{Res}(f, g)$ as the
determinant of a linear map. It turns out that the original definition (1.2)
of resultant has a similar interpretation. Let $P_n$ denote the vector space
of polynomials of degree $\le n$. Since such a polynomial can be written
$a_0 x^n + \cdots + a_n$, it follows that $\{x^n, \ldots, 1\}$ is a basis of
$P_n$.

a. Given $f, g$ as in (1.1), show that if
$(A, B) \in P_{m-1} \oplus P_{l-1}$, then $Af + Bg$ is in $P_{l+m-1}$.
Conclude that we get a linear map
$\Phi_{f,g} : P_{m-1} \oplus P_{l-1} \to P_{l+m-1}$.

b. If we use the bases $\{x^{m-1}, \ldots, 1\}$ of $P_{m-1}$,
$\{x^{l-1}, \ldots, 1\}$ of $P_{l-1}$ and $\{x^{l+m-1}, \ldots, 1\}$ of
$P_{l+m-1}$, show that the matrix of the linear map $\Phi_{f,g}$ from part a
is exactly the matrix used in (1.2). Thus,
$\mathrm{Res}(f, g) = \det(\Phi_{f,g})$, provided we use the above bases.

c. If $\mathrm{Res}(f, g) \ne 0$, conclude that every polynomial of degree
$\le l + m - 1$ can be written uniquely as $Af + Bg$ where $\deg(A) < m$ and
$\deg(B) < l$.

Exercise 13. In the text, we only proved Proposition (1.5) in the special
case when $g(\xi_1), \ldots, g(\xi_l)$ are distinct. For the general case,
suppose $f = a_0 (x - \xi_1)^{a_1} \cdots (x - \xi_r)^{a_r}$, where
$\xi_1, \ldots, \xi_r$ are distinct. Then we want to prove that
$\det(m_g) = \prod_{i=1}^{r} g(\xi_i)^{a_i}$.

a. First, suppose that $f = (x - \xi)^a$. In this case, we can use the basis
of $A_f$ given by $\{(x - \xi)^{a-1}, \ldots, x - \xi, 1\}$ (as usual, we
think of $A_f$ as consisting of remainders). Then show that the matrix of
$m_g$ with respect to the above basis is upper triangular with diagonal
entries all equal to $g(\xi)$. Conclude that $\det(m_g) = g(\xi)^a$. Hint:
Write $g = b_0 x^m + \cdots + b_m$ in the form
$g = c_0 (x - \xi)^m + \cdots + c_{m-1}(x - \xi) + c_m$ by replacing $x$ with
$(x - \xi) + \xi$ and using the binomial theorem. Then let $x = \xi$ to get
$c_m = g(\xi)$.

b. In general, when $f = a_0 (x - \xi_1)^{a_1} \cdots (x - \xi_r)^{a_r}$,
show that there is a well-defined map

\[
A_f \to \bigl(k[x]/\langle (x - \xi_1)^{a_1} \rangle\bigr) \oplus \cdots \oplus \bigl(k[x]/\langle (x - \xi_r)^{a_r} \rangle\bigr)
\]

which preserves sums and products. Hint: This is where working with
cosets is a help. It is easy to show that the map sending $[h] \in A_f$ to
$[h] \in k[x]/\langle (x - \xi_i)^{a_i} \rangle$ is well-defined since
$(x - \xi_i)^{a_i}$ divides $f$.

c. Show that the map of part b is a ring isomorphism. Hint: First show
that the map is one-to-one, and then use linear algebra and a dimension
count to show it is onto.

d. By considering multiplication by $g$ on

\[
\bigl(k[x]/\langle (x - \xi_1)^{a_1} \rangle\bigr) \oplus \cdots \oplus \bigl(k[x]/\langle (x - \xi_r)^{a_r} \rangle\bigr)
\]

and using part a, conclude that
$\det(m_g) = \prod_{i=1}^{r} g(\xi_i)^{a_i}$, as desired.

Exercise 14. This exercise will complete the proof of Proposition (1.7).
Suppose that $F, G$ are given by (1.6) and assume $a_0 \ne 0$ and
$b_0 = \cdots = b_{r-1} = 0$ but $b_r \ne 0$. If we dehomogenize by setting
$y = 1$, we get polynomials $f, g$ of degree $l, m - r$ respectively.

a. Show that $\mathrm{Res}(F, G) = a_0^r\, \mathrm{Res}(f, g)$.

b. Show that $\mathrm{Res}(F, G) = 0$ if and only if $F = G = 0$ has a
nontrivial solution. Hint: Modify the argument given in the text for the case
when $a_0$ and $b_0$ were both nonzero.



§2 Multipolynomial Resultants 

In §1, we studied the resultant of two homogeneous polynomials $F, G$ in
variables $x, y$. Generalizing this, suppose we are given $n + 1$ homogeneous
polynomials $F_0, \ldots, F_n$ in variables $x_0, \ldots, x_n$, and assume
that each $F_i$ has positive total degree. Then we get $n + 1$ equations in
$n + 1$ unknowns:

\[ F_0(x_0, \ldots, x_n) = \cdots = F_n(x_0, \ldots, x_n) = 0. \tag{2.1} \]

Because the $F_i$ are homogeneous of positive total degree, these equations
always have the solution $x_0 = \cdots = x_n = 0$, which we call the trivial
solution. Hence, the crucial question is whether there is a nontrivial
solution. For the rest of this chapter, we will work over the complex
numbers, so that a nontrivial solution will be a point in
$\mathbb{C}^{n+1} \setminus \{(0, \ldots, 0)\}$.

In general, the existence of a nontrivial solution depends on the
coefficients of the polynomials $F_0, \ldots, F_n$: for most values of the
coefficients, there are no nontrivial solutions, while for certain special
values, they exist.

One example where this is easy to see is when the polynomials $F_i$ are all
linear, i.e., have total degree 1. Since they are homogeneous, the equations
(2.1) can be written in the form:

\[
\begin{aligned}
F_0 &= c_{00} x_0 + \cdots + c_{0n} x_n = 0 \\
    &\;\;\vdots \\
F_n &= c_{n0} x_0 + \cdots + c_{nn} x_n = 0.
\end{aligned}
\tag{2.2}
\]

This is an $(n + 1) \times (n + 1)$ system of linear equations, so that by
linear algebra, there is a nontrivial solution if and only if the determinant
of the coefficient matrix vanishes. Thus we get the single condition
$\det(c_{ij}) = 0$ for the existence of a nontrivial solution. Note that this
determinant is a polynomial in the coefficients $c_{ij}$.

Exercise 1. There was a single condition for a nontrivial solution of (2.2)
because the number of equations ($n + 1$) equaled the number of unknowns
(also $n + 1$). When these numbers are different, here is what can happen.

a. If we have $r < n + 1$ linear equations in $n + 1$ unknowns, explain why
there is always a nontrivial solution, no matter what the coefficients are.

b. When we have $r > n + 1$ linear equations in $n + 1$ unknowns, things
are more complicated. For example, show that the equations

\[
\begin{aligned}
F_0 &= c_{00} x + c_{01} y = 0 \\
F_1 &= c_{10} x + c_{11} y = 0 \\
F_2 &= c_{20} x + c_{21} y = 0
\end{aligned}
\]

have a nontrivial solution if and only if the three conditions

\[
\det \begin{pmatrix} c_{00} & c_{01} \\ c_{10} & c_{11} \end{pmatrix}
= \det \begin{pmatrix} c_{00} & c_{01} \\ c_{20} & c_{21} \end{pmatrix}
= \det \begin{pmatrix} c_{10} & c_{11} \\ c_{20} & c_{21} \end{pmatrix} = 0
\]

are satisfied.

In general, when we have $n + 1$ homogeneous polynomials
$F_0, \ldots, F_n \in \mathbb{C}[x_0, \ldots, x_n]$, we get the following
Basic Question: What conditions must the coefficients of $F_0, \ldots, F_n$
satisfy in order that $F_0 = \cdots = F_n = 0$ has a nontrivial solution? To
state the answer precisely, we need to introduce some notation. Suppose that
$d_i$ is the total degree of $F_i$, so that $F_i$ can be written

\[ F_i = \sum_{|\alpha| = d_i} c_{i,\alpha} x^\alpha. \]

For each possible pair of indices $i, \alpha$, we introduce a variable
$u_{i,\alpha}$. Then, given a polynomial $P \in \mathbb{C}[u_{i,\alpha}]$, we
let $P(F_0, \ldots, F_n)$ denote the number obtained by replacing each
variable $u_{i,\alpha}$ in $P$ with the corresponding coefficient
$c_{i,\alpha}$. This is what we mean by a polynomial in the coefficients of
the $F_i$. We can now answer our Basic Question.

(2.3) Theorem. If we fix positive degrees $d_0, \ldots, d_n$, then there is a
unique polynomial $\mathrm{Res} \in \mathbb{Z}[u_{i,\alpha}]$ which has the
following properties:

a. If $F_0, \ldots, F_n \in \mathbb{C}[x_0, \ldots, x_n]$ are homogeneous of
degrees $d_0, \ldots, d_n$, then the equations (2.1) have a nontrivial
solution over $\mathbb{C}$ if and only if
$\mathrm{Res}(F_0, \ldots, F_n) = 0$.

b. $\mathrm{Res}(x_0^{d_0}, \ldots, x_n^{d_n}) = 1$.

c. $\mathrm{Res}$ is irreducible, even when regarded as a polynomial in
$\mathbb{C}[u_{i,\alpha}]$.

Proof. A complete proof of the existence of the resultant is beyond the 
scope of this book. See Chapter 13 of [GKZ] or §78 of [vdW] for proofs. 
At the end of this section, we will indicate some of the intuition behind 
the proof when we discuss the geometry of the resultant. The question of 
uniqueness will be considered in Exercise 5. □ 

We call $\mathrm{Res}(F_0, \ldots, F_n)$ the resultant of
$F_0, \ldots, F_n$. Sometimes we write $\mathrm{Res}_{d_0, \ldots, d_n}$
instead of $\mathrm{Res}$ if we want to make the dependence on the degrees
more explicit. In this notation, if each
$F_i = \sum_{j=0}^{n} c_{ij} x_j$ is linear, then the discussion following
(2.2) shows that

\[ \mathrm{Res}_{1, \ldots, 1}(F_0, \ldots, F_n) = \det(c_{ij}). \]

Another example is the resultant of two polynomials, which was discussed in
§1. In this case, we know that $\mathrm{Res}(F_0, F_1)$ is given by the
determinant (1.2). Theorem (2.3) tells us that this determinant is an
irreducible polynomial in the coefficients of $F_0, F_1$.

Before giving further examples of multipolynomial resultants, we want to 
indicate their usefulness in applications. Let’s consider the implicitization 
problem, which asks for the equation of a parametric curve or surface. For 
concreteness, suppose a surface is given parametrically by the equations 

\[
\begin{aligned}
x &= f(s, t) \\
y &= g(s, t) \\
z &= h(s, t),
\end{aligned}
\tag{2.4}
\]

where $f(s, t)$, $g(s, t)$, $h(s, t)$ are polynomials (not necessarily
homogeneous) of total degrees $d_0, d_1, d_2$. There are several methods to
find the equation $p(x, y, z) = 0$ of the surface described by (2.4). For
example, Chapter 3 of [CLO] uses Gröbner bases for this purpose. We claim
that in many cases, multipolynomial resultants can be used to find the
equation of the surface.

To use our methods, we need homogeneous polynomials, and hence we
will homogenize the above equations with respect to a third variable $u$. For
example, if we write $f(s, t)$ in the form

\[ f(s, t) = f_{d_0}(s, t) + f_{d_0 - 1}(s, t) + \cdots + f_0(s, t), \]

where $f_j$ is homogeneous of total degree $j$ in $s, t$, then we get

\[ F(s, t, u) = f_{d_0}(s, t) + f_{d_0 - 1}(s, t)\, u + \cdots + f_0(s, t)\, u^{d_0}, \]

which is now homogeneous in $s, t, u$ of total degree $d_0$. Similarly,
$g(s, t)$ and $h(s, t)$ homogenize to $G(s, t, u)$ and $H(s, t, u)$, and the
equations (2.4) become

\[ F(s, t, u) - x u^{d_0} = G(s, t, u) - y u^{d_1} = H(s, t, u) - z u^{d_2} = 0. \tag{2.5} \]

Note that $x, y, z$ are regarded as coefficients in these equations.

We can now solve the implicitization problem for (2.4) as follows. 

(2.6) Proposition. With the above notation, assume that the system of
homogeneous equations

\[ f_{d_0}(s, t) = g_{d_1}(s, t) = h_{d_2}(s, t) = 0 \]

has only the trivial solution. Then, for a given triple
$(x, y, z) \in \mathbb{C}^3$, the equations (2.4) have a solution
$(s, t) \in \mathbb{C}^2$ if and only if

\[ \mathrm{Res}_{d_0, d_1, d_2}(F - x u^{d_0},\, G - y u^{d_1},\, H - z u^{d_2}) = 0. \]

Proof. By Theorem (2.3), the resultant vanishes if and only if (2.5) has
a nontrivial solution $(s, t, u)$. If $u \ne 0$, then $(s/u, t/u)$ is a
solution to (2.4). However, if $u = 0$, then $(s, t)$ is a nontrivial
solution of $f_{d_0}(s, t) = g_{d_1}(s, t) = h_{d_2}(s, t) = 0$, which
contradicts our hypothesis. Hence, $u = 0$ can't occur. Going the other way,
note that a solution $(s, t)$ of (2.4) gives the nontrivial solution
$(s, t, 1)$ of (2.5). □

Since the resultant is a polynomial in the coefficients, it follows that

\[ p(x, y, z) = \mathrm{Res}_{d_0, d_1, d_2}(F - x u^{d_0},\, G - y u^{d_1},\, H - z u^{d_2}) \tag{2.7} \]

is a polynomial in $x, y, z$ which, by Proposition (2.6), vanishes precisely
on the image of the parametrization. In particular, this means that the
parametrization covers all of the surface $p(x, y, z) = 0$, which is not
true for all polynomial parametrizations; the hypothesis that
$f_{d_0}(s, t) = g_{d_1}(s, t) = h_{d_2}(s, t) = 0$ has only the trivial
solution is important here.

Exercise 2.

a. If $f_{d_0}(s, t) = g_{d_1}(s, t) = h_{d_2}(s, t) = 0$ has a nontrivial
solution, show that the resultant (2.7) vanishes identically. Hint: Show that
(2.5) always has a nontrivial solution, no matter what $x, y, z$ are.

b. Show that the parametric equations $(x, y, z) = (st, s^2 t, s t^2)$ define
the surface $x^3 = yz$. By part a, we know that the resultant (2.7) can't be
used to find this equation. Show that in this case, it is also true that
the parametrization is not onto: there are points on the surface which
don't come from any $s, t$.

We should point out that for some systems of equations, such as

\[
\begin{aligned}
x &= 1 + s + t + st \\
y &= 2 + s + 3t + st \\
z &= s - t + st,
\end{aligned}
\]

the resultant (2.7) vanishes identically by Exercise 2, yet a resultant can
still be defined. This is one of the sparse resultants which we will consider
in Chapter 7.

One difficulty with multipolynomial resultants is that they tend to be
very large expressions. For example, consider the system of equations given
by 3 quadratic forms in 3 variables:

\[
\begin{aligned}
F_0 &= c_{01} x^2 + c_{02} y^2 + c_{03} z^2 + c_{04} xy + c_{05} xz + c_{06} yz = 0 \\
F_1 &= c_{11} x^2 + c_{12} y^2 + c_{13} z^2 + c_{14} xy + c_{15} xz + c_{16} yz = 0 \\
F_2 &= c_{21} x^2 + c_{22} y^2 + c_{23} z^2 + c_{24} xy + c_{25} xz + c_{26} yz = 0.
\end{aligned}
\]

Classically, this is a system of "three ternary quadrics". By Theorem (2.3),
the resultant $\mathrm{Res}_{2,2,2}(F_0, F_1, F_2)$ vanishes exactly when
this system has a nontrivial solution in $x, y, z$.

The polynomial $\mathrm{Res}_{2,2,2}$ is very large: it has 18 variables (one
for each coefficient $c_{ij}$), and the theory of §3 will tell us that it has
total degree 12. Written out in its full glory, $\mathrm{Res}_{2,2,2}$ has
21,894 terms (we are grateful to Bernd Sturmfels for this computation).
Hence, to work effectively with this resultant, we need to learn some more
compact ways of representing it. We will study this topic in more detail in
§3 and §4, but to whet the reader's appetite, we will now give one of the
many interesting formulas for $\mathrm{Res}_{2,2,2}$.

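First, let $J$ denote the Jacobian determinant of $F_0, F_1, F_2$:

\[
J = \det \begin{pmatrix}
\dfrac{\partial F_0}{\partial x} & \dfrac{\partial F_0}{\partial y} & \dfrac{\partial F_0}{\partial z} \\[2ex]
\dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} & \dfrac{\partial F_1}{\partial z} \\[2ex]
\dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} & \dfrac{\partial F_2}{\partial z}
\end{pmatrix},
\]

which is a cubic homogeneous polynomial in $x, y, z$. This means that the
partial derivatives of $J$ are quadratic and hence can be written in the
following form:

\[
\begin{aligned}
\frac{\partial J}{\partial x} &= b_{01} x^2 + b_{02} y^2 + b_{03} z^2 + b_{04} xy + b_{05} xz + b_{06} yz \\
\frac{\partial J}{\partial y} &= b_{11} x^2 + b_{12} y^2 + b_{13} z^2 + b_{14} xy + b_{15} xz + b_{16} yz \\
\frac{\partial J}{\partial z} &= b_{21} x^2 + b_{22} y^2 + b_{23} z^2 + b_{24} xy + b_{25} xz + b_{26} yz.
\end{aligned}
\]

Note that each $b_{ij}$ is a cubic polynomial in the $c_{ij}$. Then, by a
classical formula of Salmon (see [Sal], Art. 90), the resultant of three
ternary quadrics is given by the $6 \times 6$ determinant

\[
\mathrm{Res}_{2,2,2}(F_0, F_1, F_2) = \frac{-1}{512} \det \begin{pmatrix}
c_{01} & c_{02} & c_{03} & c_{04} & c_{05} & c_{06} \\
c_{11} & c_{12} & c_{13} & c_{14} & c_{15} & c_{16} \\
c_{21} & c_{22} & c_{23} & c_{24} & c_{25} & c_{26} \\
b_{01} & b_{02} & b_{03} & b_{04} & b_{05} & b_{06} \\
b_{11} & b_{12} & b_{13} & b_{14} & b_{15} & b_{16} \\
b_{21} & b_{22} & b_{23} & b_{24} & b_{25} & b_{26}
\end{pmatrix}.
\tag{2.8}
\]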



Exercise 3.

a. Use (2.8) to explain why $\mathrm{Res}_{2,2,2}$ has total degree 12 in the
variables $c_{01}, \ldots, c_{26}$.

b. Why is the fraction $-1/512$ needed in (2.8)? Hint: Compute the
resultant $\mathrm{Res}_{2,2,2}(x^2, y^2, z^2)$.

c. Use (2.7) and (2.8) to find the equation of the surface defined by the
equations

\[
\begin{aligned}
x &= 1 + s + t + st \\
y &= 2 + s + st + t^2 \\
z &= s + t + s^2.
\end{aligned}
\]

Note that $st = st + t^2 = s^2 = 0$ has only the trivial solution, so that
Proposition (2.6) applies. You should compare your answer to Exercise 6
of §1.

In §4 we will study the general question of how to find a formula for a 
given resultant. Here is an example which illustrates one of the methods 
we will use. Consider the following system of three homogeneous equations 
in three variables: 

\[
\begin{aligned}
F_0 &= a_1 x + a_2 y + a_3 z = 0 \\
F_1 &= b_1 x + b_2 y + b_3 z = 0 \\
F_2 &= c_1 x^2 + c_2 y^2 + c_3 z^2 + c_4 xy + c_5 xz + c_6 yz = 0.
\end{aligned}
\tag{2.9}
\]

Since $F_0$ and $F_1$ are linear and $F_2$ is quadratic, the resultant
involved is $\mathrm{Res}_{1,1,2}(F_0, F_1, F_2)$. We get the following
formula for this resultant.

(2.10) Proposition. $\mathrm{Res}_{1,1,2}(F_0, F_1, F_2)$ is given by the
polynomial

\[
\begin{aligned}
& a_1^2 b_2^2 c_3 - a_1^2 b_2 b_3 c_6 + a_1^2 b_3^2 c_2 - 2 a_1 a_2 b_1 b_2 c_3 + a_1 a_2 b_1 b_3 c_6 \\
&+ a_1 a_2 b_2 b_3 c_5 - a_1 a_2 b_3^2 c_4 + a_1 a_3 b_1 b_2 c_6 - 2 a_1 a_3 b_1 b_3 c_2 - a_1 a_3 b_2^2 c_5 \\
&+ a_1 a_3 b_2 b_3 c_4 + a_2^2 b_1^2 c_3 - a_2^2 b_1 b_3 c_5 + a_2^2 b_3^2 c_1 - a_2 a_3 b_1^2 c_6 \\
&+ a_2 a_3 b_1 b_2 c_5 + a_2 a_3 b_1 b_3 c_4 - 2 a_2 a_3 b_2 b_3 c_1 + a_3^2 b_1^2 c_2 - a_3^2 b_1 b_2 c_4 + a_3^2 b_2^2 c_1.
\end{aligned}
\]

Proof. Let $R$ denote the above polynomial, and suppose we have a
nontrivial solution $(x, y, z)$ of (2.9). We will first show that this forces
a slight variant of $R$ to vanish. Namely, consider the six equations

\[ x \cdot F_0 = y \cdot F_0 = z \cdot F_0 = y \cdot F_1 = z \cdot F_1 = 1 \cdot F_2 = 0, \tag{2.11} \]



which we can write as

\[
\begin{array}{rcrcrcrcrcrl}
a_1 x^2 &+& 0 &+& 0 &+& a_2 xy &+& a_3 xz &+& 0 &= 0 \\
0 &+& a_2 y^2 &+& 0 &+& a_1 xy &+& 0 &+& a_3 yz &= 0 \\
0 &+& 0 &+& a_3 z^2 &+& 0 &+& a_1 xz &+& a_2 yz &= 0 \\
0 &+& b_2 y^2 &+& 0 &+& b_1 xy &+& 0 &+& b_3 yz &= 0 \\
0 &+& 0 &+& b_3 z^2 &+& 0 &+& b_1 xz &+& b_2 yz &= 0 \\
c_1 x^2 &+& c_2 y^2 &+& c_3 z^2 &+& c_4 xy &+& c_5 xz &+& c_6 yz &= 0.
\end{array}
\]

If we regard $x^2, y^2, z^2, xy, xz, yz$ as "unknowns", then this system of
six linear equations has a nontrivial solution, which implies that the
determinant $D$ of its coefficient matrix is zero. Using a computer, one
easily checks that the determinant is $D = -a_1 R$.
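
The identity $D = -a_1 R$ is a polynomial identity in twelve variables, so it can at least be spot-checked numerically. The sketch below is our own illustration, not the computer check referred to in the text: it builds the coefficient matrix of (2.11) with columns ordered $x^2, y^2, z^2, xy, xz, yz$, evaluates the polynomial of Proposition (2.10), and compares the two at integer sample points chosen arbitrarily. Agreement at sample points is evidence for the identity, not a proof.

```python
def det(M):
    """Integer determinant by cofactor expansion along the first row."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def D(a, b, c):
    """Determinant of the 6 x 6 coefficient matrix of the system (2.11)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return det([
        [a1, 0, 0, a2, a3, 0],    # x * F0
        [0, a2, 0, a1, 0, a3],    # y * F0
        [0, 0, a3, 0, a1, a2],    # z * F0
        [0, b2, 0, b1, 0, b3],    # y * F1
        [0, 0, b3, 0, b1, b2],    # z * F1
        list(c),                  # 1 * F2
    ])

def R(a, b, c):
    """The 21-term polynomial of Proposition (2.10)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    c1, c2, c3, c4, c5, c6 = c
    return (a1**2*b2**2*c3 - a1**2*b2*b3*c6 + a1**2*b3**2*c2
            - 2*a1*a2*b1*b2*c3 + a1*a2*b1*b3*c6
            + a1*a2*b2*b3*c5 - a1*a2*b3**2*c4 + a1*a3*b1*b2*c6
            - 2*a1*a3*b1*b3*c2 - a1*a3*b2**2*c5
            + a1*a3*b2*b3*c4 + a2**2*b1**2*c3 - a2**2*b1*b3*c5
            + a2**2*b3**2*c1 - a2*a3*b1**2*c6
            + a2*a3*b1*b2*c5 + a2*a3*b1*b3*c4 - 2*a2*a3*b2*b3*c1
            + a3**2*b1**2*c2 - a3**2*b1*b2*c4 + a3**2*b2**2*c1)

# Arbitrary integer sample point (our choice).
a, b, c = (2, 3, 5), (7, 11, 13), (1, 4, 9, 16, 25, 36)
print(D(a, b, c) == -a[0] * R(a, b, c))   # True

# The normalization used at the end of the proof: (F0, F1, F2) = (x, y, z^2).
print(R((1, 0, 0), (0, 1, 0), (0, 0, 1, 0, 0, 0)))   # 1
```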

Thinking geometrically, we have proved that in the 12 dimensional space
$\mathbb{C}^{12}$ with $a_1, \ldots, c_6$ as coordinates, the polynomial $D$
vanishes on the set

\[ \{(a_1, \ldots, c_6) : \text{(2.9) has a nontrivial solution}\} \subset \mathbb{C}^{12}. \tag{2.12} \]

However, by Theorem (2.3), having a nontrivial solution is equivalent to the
vanishing of the resultant, so that $D$ vanishes on the set

\[ \mathbf{V}(\mathrm{Res}_{1,1,2}) \subset \mathbb{C}^{12}. \]

This means that
$D \in \mathbf{I}(\mathbf{V}(\mathrm{Res}_{1,1,2})) = \sqrt{\langle \mathrm{Res}_{1,1,2} \rangle}$,
where the last equality is by the Nullstellensatz (see §4 of Chapter 1). But
$\mathrm{Res}_{1,1,2}$ is irreducible, which easily implies that
$\sqrt{\langle \mathrm{Res}_{1,1,2} \rangle} = \langle \mathrm{Res}_{1,1,2} \rangle$.
This proves that $D \in \langle \mathrm{Res}_{1,1,2} \rangle$, so that
$D = -a_1 R$ is a multiple of $\mathrm{Res}_{1,1,2}$. Irreducibility then
implies that $\mathrm{Res}_{1,1,2}$ divides either $a_1$ or $R$. The results
of §3 will tell us that $\mathrm{Res}_{1,1,2}$ has total degree 5. It follows
that $\mathrm{Res}_{1,1,2}$ divides $R$, and since $R$ also has total degree
5, it must be a constant multiple of $\mathrm{Res}_{1,1,2}$. By computing the
value of each when $(F_0, F_1, F_2) = (x, y, z^2)$, we see that the constant
must be 1, which proves that $R = \mathrm{Res}_{1,1,2}$, as desired. □

Exercise 4. Verify that $R = 1$ when $(F_0, F_1, F_2) = (x, y, z^2)$.

The equations (2.11) may seem somewhat unmotivated. In §4 we will see
that there is a systematic reason for choosing these equations.

The final topic of this section is the geometric interpretation of the
resultant. We will use the same framework as in Theorem (2.3). This means
that we consider homogeneous polynomials of degree $d_0, \ldots, d_n$, and
for each monomial $x^\alpha$ of degree $d_i$, we introduce a variable
$u_{i,\alpha}$. Let $M$ be the total number of these variables, so that
$\mathbb{C}^M$ is an affine space with coordinates $u_{i,\alpha}$ for all
$0 \le i \le n$ and $|\alpha| = d_i$. A point of $\mathbb{C}^M$ will be
written $(c_{i,\alpha})$. Then consider the "universal" polynomials

\[ \mathbf{F}_i = \sum_{|\alpha| = d_i} u_{i,\alpha} x^\alpha, \quad i = 0, \ldots, n. \]

Note that the coefficients of the $x^\alpha$ are the variables
$u_{i,\alpha}$. If we evaluate $\mathbf{F}_0, \ldots, \mathbf{F}_n$ at
$(c_{i,\alpha}) \in \mathbb{C}^M$, we get the polynomials
$F_0, \ldots, F_n$, where
$F_i = \sum_{|\alpha| = d_i} c_{i,\alpha} x^\alpha$. Thus, we can think of
points of $\mathbb{C}^M$ as parametrizing all possible $(n + 1)$-tuples of
homogeneous polynomials of degrees $d_0, \ldots, d_n$.

To keep track of nontrivial solutions of these polynomials, we will use
projective space $\mathbb{P}^n(\mathbb{C})$, which we write as
$\mathbb{P}^n$ for short. Recall the following:

• A point in $\mathbb{P}^n$ has homogeneous coordinates
$(a_0, \ldots, a_n)$, where $a_i \in \mathbb{C}$ are not all zero, and
another set of coordinates $(b_0, \ldots, b_n)$ gives the same point in
$\mathbb{P}^n$ if and only if there is a complex number $\lambda \ne 0$ such
that $(b_0, \ldots, b_n) = \lambda (a_0, \ldots, a_n)$.

• If $F(x_0, \ldots, x_n)$ is homogeneous of degree $d$ and
$(b_0, \ldots, b_n) = \lambda (a_0, \ldots, a_n)$ are two sets of homogeneous
coordinates for some point $p \in \mathbb{P}^n$, then

\[ F(b_0, \ldots, b_n) = \lambda^d F(a_0, \ldots, a_n). \]

Thus, we can't define the value of $F$ at $p$, but the equation $F(p) = 0$
makes perfect sense. Hence we get the projective variety
$\mathbf{V}(F) \subset \mathbb{P}^n$, which is the set of points of
$\mathbb{P}^n$ where $F$ vanishes.

For a homogeneous polynomial $F$, notice that
$\mathbf{V}(F) \subset \mathbb{P}^n$ is determined by the nontrivial
solutions of $F = 0$. For more on projective space, see Chapter 8 of [CLO].
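
To make the scaling rule concrete, here is a small worked instance with an illustrative polynomial of our own choosing, $F = x_0^2 - x_1 x_2$ (so $d = 2$), and $\lambda = 3$:

```latex
\begin{aligned}
F(1, 2, 1) &= 1 - 2 = -1, \\
F(3, 6, 3) &= 9 - 18 = -9 = 3^2 \cdot F(1, 2, 1), \\
F(1, 1, 1) &= 0 \quad\Longrightarrow\quad F(3, 3, 3) = 3^2 \cdot 0 = 0 .
\end{aligned}
```

The value of $F$ changes with the choice of homogeneous coordinates, but whether it is zero does not.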

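Now consider the product $\mathbb{C}^M \times \mathbb{P}^n$. A point
$(c_{i,\alpha}, a_0, \ldots, a_n) \in \mathbb{C}^M \times \mathbb{P}^n$ can be
regarded as $n + 1$ homogeneous polynomials and a point of $\mathbb{P}^n$.
The "universal" polynomials $\mathbf{F}_i$ are actually polynomials on
$\mathbb{C}^M \times \mathbb{P}^n$, which gives the subset
$W = \mathbf{V}(\mathbf{F}_0, \ldots, \mathbf{F}_n)$. Concretely, this set is
given by

\[
\begin{aligned}
W &= \{(c_{i,\alpha}, a_0, \ldots, a_n) \in \mathbb{C}^M \times \mathbb{P}^n : (a_0, \ldots, a_n) \text{ is a} \\
  &\qquad \text{nontrivial solution of } F_0 = \cdots = F_n = 0, \text{ where} \\
  &\qquad F_0, \ldots, F_n \text{ are determined by } (c_{i,\alpha})\} \\
  &= \{\text{all possible pairs consisting of a set of equations} \\
  &\qquad F_0 = \cdots = F_n = 0 \text{ of degrees } d_0, \ldots, d_n \text{ and} \\
  &\qquad \text{a nontrivial solution of the equations}\}.
\end{aligned}
\tag{2.13}
\]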

Now comes the interesting part: there is a natural projection map

\[ \pi : \mathbb{C}^M \times \mathbb{P}^n \to \mathbb{C}^M \]

defined by $\pi(c_{i,\alpha}, a_0, \ldots, a_n) = (c_{i,\alpha})$, and under
this projection, the variety $W \subset \mathbb{C}^M \times \mathbb{P}^n$
maps to

\[
\begin{aligned}
\pi(W) &= \{(c_{i,\alpha}) \in \mathbb{C}^M : \text{there is } (a_0, \ldots, a_n) \in \mathbb{P}^n \\
  &\qquad \text{such that } (c_{i,\alpha}, a_0, \ldots, a_n) \in W\} \\
  &= \{\text{all possible sets of equations } F_0 = \cdots = F_n = 0 \text{ of} \\
  &\qquad \text{degrees } d_0, \ldots, d_n \text{ which have a nontrivial solution}\}.
\end{aligned}
\]

Note that when the degrees are $(d_0, d_1, d_2) = (1, 1, 2)$, $\pi(W)$ is as
in (2.12).

The essential content of Theorem (2.3) is that the set $\pi(W)$ is defined
by the single irreducible equation $\mathrm{Res}_{d_0, \ldots, d_n} = 0$. To
prove this, first note that $\pi(W)$ is a variety in $\mathbb{C}^M$ by the
following result of elimination theory.

• (Projective Extension Theorem) Given a variety
$W \subset \mathbb{C}^M \times \mathbb{P}^n$ and the projection map
$\pi : \mathbb{C}^M \times \mathbb{P}^n \to \mathbb{C}^M$, the image
$\pi(W)$ is a variety in $\mathbb{C}^M$.

(See, for example, §5 of Chapter 8 of [CLO].) This is one of the key reasons
we work with projective space (the corresponding assertion for affine space
is false in general). Hence $\pi(W)$ is defined by the vanishing of certain
polynomials on $\mathbb{C}^M$. In other words, the existence of a nontrivial
solution of $F_0 = \cdots = F_n = 0$ is determined by polynomial conditions
on the coefficients of $F_0, \ldots, F_n$.

The second step in the proof is to show that we need only one polynomial
and that this polynomial is irreducible. Here, a rigorous proof requires
knowing certain facts about the dimension and irreducible components of
a variety (see, for example, [Sha], §6 of Chapter I). If we accept an
intuitive idea of dimension, then the basic idea is to show that the variety
$\pi(W) \subset \mathbb{C}^M$ is irreducible (can't be decomposed into
smaller pieces which are still varieties) of dimension $M - 1$. In this case,
the theory will tell us that $\pi(W)$ must be defined by exactly one
irreducible equation, which is the resultant
$\mathrm{Res}_{d_0, \ldots, d_n} = 0$.

To prove this, first note that $\mathbb{C}^M \times \mathbb{P}^n$ has
dimension $M + n$. Then observe that
$W \subset \mathbb{C}^M \times \mathbb{P}^n$ is defined by the $n + 1$
equations $\mathbf{F}_0 = \cdots = \mathbf{F}_n = 0$. Intuitively, each
equation drops the dimension by one, though strictly speaking, this requires
that the equations be "independent" in an appropriate sense. In our
particular case, this is true because each equation involves a disjoint set
of coefficient variables $u_{i,\alpha}$. Thus the dimension of $W$ is
$(M + n) - (n + 1) = M - 1$. One can also show that $W$ is irreducible
(see Exercise 9 below). From here, standard arguments imply that $\pi(W)$
is irreducible. The final part of the argument is to show that the map
$W \to \pi(W)$ is one-to-one "most of the time". Here, the idea is that if
$F_0 = \cdots = F_n = 0$ do happen to have a nontrivial solution, then this
solution is usually unique (up to a scalar multiple). For the special case
when all of the $F_i$ are linear, we will prove this in Exercise 10 below.
For the general case, see Proposition 3.1 of Chapter 3 of [GKZ]. Since
$W \to \pi(W)$ is onto and one-to-one most of the time, $\pi(W)$ also has
dimension $M - 1$.



Additional Exercises for §2 

Exercise 5. To prove the uniqueness of the resultant, suppose there are 
two polynomials Res and Res' satisfing the conditions of Theorem (2.3). 

a. Adapt the argument used in the proof of Proposition (2.10) to show that 
Res divides Res' and Res' divides Res. Note that this uses conditions a 
and c of the theorem. 

b. Now use condition b of Theorem (2.3) to conclude that Res = Res'. 

Exercise 6. A homogeneous polynomial in $\mathbb{C}[x]$ is written in the form
$ax^d$. Show that $\mathrm{Res}_d(ax^d) = a$. Hint: Use Exercise 5.

Exercise 7. When the hypotheses of Proposition (2.6) are satisfied, the
resultant (2.7) gives a polynomial $p(x, y, z)$ which vanishes precisely on the
parametrized surface. However, $p$ need not have the smallest possible total
degree: it can happen that $p = q^d$ for some polynomial $q$ of smaller total
degree. For example, consider the (fairly silly) parametrization given by
$(x, y, z) = (s, s, t^2)$. Use the formula of Proposition (2.10) to show that in
this case, $p$ is the square of another polynomial.

Exercise 8. The method used in the proof of Proposition (2.10) can be
used to explain how the determinant (1.2) arises from nontrivial solutions
$F = G = 0$, where $F, G$ are as in (1.6). Namely, if $(x, y)$ is a nontrivial
solution of (1.6), then consider the $l + m$ equations

$$
\begin{aligned}
x^{m-1} \cdot F &= 0\\
x^{m-2}y \cdot F &= 0\\
&\;\;\vdots\\
y^{m-1} \cdot F &= 0\\
x^{l-1} \cdot G &= 0\\
x^{l-2}y \cdot G &= 0\\
&\;\;\vdots\\
y^{l-1} \cdot G &= 0.
\end{aligned}
$$

Regarding this as a system of linear equations in unknowns
$x^{l+m-1}, x^{l+m-2}y, \ldots, y^{l+m-1}$, show that the coefficient matrix is exactly the
transpose of (1.2), and conclude that the determinant of this matrix must vanish
whenever (1.6) has a nontrivial solution.

Exercise 9. In this exercise, we will give a rigorous proof that the set $W$
from (2.13) is irreducible of dimension $M - 1$. For convenience, we will
write a point of $\mathbb{C}^M$ as $(F_0, \ldots, F_n)$.

a. If $p = (a_0, \ldots, a_n)$ are fixed homogeneous coordinates for a point
$p \in \mathbb{P}^n$, show that the map $\mathbb{C}^M \to \mathbb{C}^{n+1}$ defined by
$(F_0, \ldots, F_n) \mapsto (F_0(p), \ldots, F_n(p))$ is linear and onto. Conclude that the
kernel of this map has dimension $M - n - 1$. Denote this kernel by $K(p)$.

b. Besides the projection $\pi : \mathbb{C}^M \times \mathbb{P}^n \to \mathbb{C}^M$ used in the text, we also
have a projection map $\mathbb{C}^M \times \mathbb{P}^n \to \mathbb{P}^n$, which is projection on the second
factor. If we restrict this map to $W$, we get a map $\tilde\pi : W \to \mathbb{P}^n$ defined
by $\tilde\pi(F_0, \ldots, F_n, p) = p$. Then show that

$$\tilde\pi^{-1}(p) = K(p) \times \{p\},$$

where as usual $\tilde\pi^{-1}(p)$ is the inverse image of $p \in \mathbb{P}^n$ under $\tilde\pi$, i.e., the
set of all points of $W$ which map to $p$ under $\tilde\pi$. In particular, this shows
that $\tilde\pi : W \to \mathbb{P}^n$ is onto and that all inverse images of points are
irreducible (being linear subspaces) of the same dimension.

c. Use Theorem 8 of [Sha], §6 of Chapter I, to conclude that $W$ is
irreducible.

d. Use Theorem 7 of [Sha], §6 of Chapter I, to conclude that $W$ has
dimension $M - 1 = n$ (dimension of $\mathbb{P}^n$) $+\ M - n - 1$ (dimension of the
inverse images).

Exercise 10. In this exercise, we will show that the map $W \to \pi(W)$ is
usually one-to-one in the special case when $F_0, \ldots, F_n$ have degree 1. Here,
we know that if $F_i = \sum_{j=0}^n c_{ij}x_j$, then $\mathrm{Res}(F_0, \ldots, F_n) = \det(A)$, where
$A = (c_{ij})$. Note that $A$ is an $(n + 1) \times (n + 1)$ matrix.

a. Show that $F_0 = \cdots = F_n = 0$ has a nontrivial solution if and only if $A$
has rank $< n + 1$.

b. If $A$ has rank $n$, prove that there is a unique nontrivial solution (up to
a scalar multiple).

c. Given $0 \le i, j \le n$, let $A^{i,j}$ be the $n \times n$ matrix obtained from $A$ by
deleting row $i$ and column $j$. Prove that $A$ has rank $< n$ if and only if
$\det(A^{i,j}) = 0$ for all $i, j$. Hint: To have rank $\ge n$, it must be possible
to find $n$ columns which are linearly independent. Then, looking at the
submatrix formed by these columns, it must be possible to find $n$ rows
which are linearly independent. This leads to one of the matrices $A^{i,j}$.

d. Let $Y = \mathbf{V}(\det(A^{i,j}) : 0 \le i, j \le n)$. Show that $Y \subset \pi(W)$ and that
$Y \ne \pi(W)$. Since $\pi(W)$ is irreducible, standard arguments show that $Y$
has dimension strictly smaller than $\pi(W)$ (see, for example, Corollary 2
to Theorem 4 of [Sha], §6 of Chapter I).

e. Show that if $a, b \in W$ and $\pi(a) = \pi(b) \in \pi(W) \setminus Y$, then $a = b$. Since
$Y$ has strictly smaller dimension than $\pi(W)$, this is a precise version of
what we mean by saying the map $W \to \pi(W)$ is “usually one-to-one”.
Hint: Use parts b and c.



§3 Properties of Resultants

In Theorem (2.3), we saw that the resultant $\mathrm{Res}(F_0, \ldots, F_n)$ vanishes if
and only if $F_0 = \cdots = F_n = 0$ has a nontrivial solution, and is irreducible
over $\mathbb{C}$ when regarded as a polynomial in the coefficients of the $F_i$. These
conditions characterize the resultant up to a constant, but they in no way
exhaust the many properties of this remarkable polynomial. This section
will contain a summary of the other main properties of the resultant. No
proofs will be given, but complete references will be provided.

Throughout this section, we will fix total degrees $d_0, \ldots, d_n > 0$ and let
$\mathrm{Res} = \mathrm{Res}_{d_0,\ldots,d_n} \in \mathbb{Z}[u_{i,\alpha}]$ be the resultant polynomial from §2.

We begin by studying the degree of the resultant. 

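(3.1) Theorem. For a fixed $j$ between 0 and $n$, Res is homogeneous in
the variables $u_{j,\alpha}$, $|\alpha| = d_j$, of degree $d_0 \cdots d_{j-1}d_{j+1} \cdots d_n$. This means
that

$$\mathrm{Res}(F_0, \ldots, \lambda F_j, \ldots, F_n) = \lambda^{d_0 \cdots d_{j-1}d_{j+1} \cdots d_n}\,\mathrm{Res}(F_0, \ldots, F_n).$$

Furthermore, the total degree of Res is $\sum_{j=0}^n d_0 \cdots d_{j-1}d_{j+1} \cdots d_n$.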

Proof. A proof can be found in §2 of [Jou] or Chapter 13 of [GKZ]. □ 
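
As a quick numerical illustration of Theorem (3.1), the following Python sketch tabulates the degree of Res in the coefficients of each $F_j$, together with the total degree. The function names are our own (they do not come from any library), and the code does nothing more than evaluate the products in the statement of the theorem.

```python
from math import prod

def resultant_degrees(degrees):
    """Degree of Res in the coefficients of F_j: the product of all d_i with i != j."""
    return [prod(d for i, d in enumerate(degrees) if i != j)
            for j in range(len(degrees))]

def resultant_total_degree(degrees):
    """Total degree of Res: the sum of the degrees in each F_j."""
    return sum(resultant_degrees(degrees))

if __name__ == "__main__":
    for ds in [(1, 1, 2), (2, 2, 2), (2, 2, 2, 2, 2)]:
        print(ds, resultant_degrees(ds), resultant_total_degree(ds))
```

For degrees (2, 2, 2), the code reports degree 4 in the coefficients of each $F_j$ and total degree 12.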

Exercise 1. Show that the final assertion of Theorem (3.1) is an immediate
consequence of the formula for $\mathrm{Res}(F_0, \ldots, \lambda F_j, \ldots, F_n)$. Hint: What is
$\mathrm{Res}(\lambda F_0, \ldots, \lambda F_n)$?

Exercise 2. Show that formulas (1.2) and (2.8) for $\mathrm{Res}_{l,m}$ and $\mathrm{Res}_{2,2,2}$
satisfy Theorem (3.1).

We next study the symmetry and multiplicativity of the resultant. 

(3.2) Theorem.

a. If $i < j$, then

$$\mathrm{Res}(F_0, \ldots, F_i, \ldots, F_j, \ldots, F_n) = (-1)^{d_0 \cdots d_n}\,\mathrm{Res}(F_0, \ldots, F_j, \ldots, F_i, \ldots, F_n),$$

where the resultant on the right is for degrees $d_0, \ldots, d_j, \ldots, d_i, \ldots, d_n$.

b. If $F_j = F_j'F_j''$ is a product of homogeneous polynomials of degrees $d_j'$
and $d_j''$, then

$$\mathrm{Res}(F_0, \ldots, F_j, \ldots, F_n) = \mathrm{Res}(F_0, \ldots, F_j', \ldots, F_n) \cdot \mathrm{Res}(F_0, \ldots, F_j'', \ldots, F_n),$$

where the resultants on the right are for degrees $d_0, \ldots, d_j', \ldots, d_n$ and
$d_0, \ldots, d_j'', \ldots, d_n$.

Proof. A proof of the first assertion of the theorem can be found in §5 of
[Jou]. As for the second, we can assume $j = n$ by part a. This case will be
covered in Exercise 9 at the end of the section. □
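
For $n = 1$, both parts of Theorem (3.2) can be checked on sample inputs, assuming the resultant of two binary forms is computed as the determinant of the Sylvester matrix from (1.2). The Python sketch below is only a sanity check, not a proof. The helper names `det`, `sylvester_resultant` and `poly_mul` are our own, and a form $a_0x^l + a_1x^{l-1}y + \cdots + a_ly^l$ is represented by the coefficient list `[a_0, ..., a_l]`.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result          # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def sylvester_resultant(f, g):
    """Determinant of the Sylvester matrix of f and g (coefficients, highest degree first)."""
    l, m = len(f) - 1, len(g) - 1
    size = l + m
    rows = [[0] * s + list(f) + [0] * (size - l - 1 - s) for s in range(m)]
    rows += [[0] * s + list(g) + [0] * (size - m - 1 - s) for s in range(l)]
    return det(rows)

def poly_mul(p, q):
    """Product of two coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

if __name__ == "__main__":
    f, g, h = [2, 0, 1, -3], [1, 4, -5], [3, 1]      # degrees 3, 2, 1
    l, m = len(f) - 1, len(g) - 1
    # Part a: swapping the two forms changes the sign by (-1)^(l*m).
    print(sylvester_resultant(f, g) == (-1) ** (l * m) * sylvester_resultant(g, f))
    # Part b: the resultant is multiplicative in the second argument.
    print(sylvester_resultant(f, poly_mul(g, h))
          == sylvester_resultant(f, g) * sylvester_resultant(f, h))
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding error.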

Exercise 3. Prove that formulas (1.2) and (2.8) for $\mathrm{Res}_{l,m}$ and $\mathrm{Res}_{2,2,2}$
satisfy part a of Theorem (3.2).

Our next task is to show that the analog of Proposition (1.5) holds
for general resultants. We begin with some notation. Given homogeneous
polynomials $F_0, \ldots, F_n \in \mathbb{C}[x_0, \ldots, x_n]$ of degrees $d_0, \ldots, d_n$, let

(3.3)
$$
\begin{aligned}
f_i(x_0, \ldots, x_{n-1}) &= F_i(x_0, \ldots, x_{n-1}, 1)\\
\bar F_i(x_0, \ldots, x_{n-1}) &= F_i(x_0, \ldots, x_{n-1}, 0).
\end{aligned}
$$

Note that $\bar F_0, \ldots, \bar F_{n-1}$ are homogeneous in $\mathbb{C}[x_0, \ldots, x_{n-1}]$ of degrees
$d_0, \ldots, d_{n-1}$.

(3.4) Theorem. If $\mathrm{Res}(\bar F_0, \ldots, \bar F_{n-1}) \ne 0$, then the quotient ring
$A = \mathbb{C}[x_0, \ldots, x_{n-1}]/\langle f_0, \ldots, f_{n-1}\rangle$ has dimension $d_0 \cdots d_{n-1}$ as a vector space
over $\mathbb{C}$, and

$$\mathrm{Res}(F_0, \ldots, F_n) = \mathrm{Res}(\bar F_0, \ldots, \bar F_{n-1})^{d_n}\,\det(m_{f_n} : A \to A),$$

where $m_{f_n} : A \to A$ is the linear map given by multiplication by $f_n$.

Proof. Although we will not prove this result (see [Jou], §§2, 3 and 4 for a
complete proof), we will explain (non-rigorously) why the above formula is
reasonable. The first step is to show that the ring $A$ is a finite dimensional
vector space over $\mathbb{C}$ when $\mathrm{Res}(\bar F_0, \ldots, \bar F_{n-1}) \ne 0$. The crucial idea is to
think in terms of the projective space $\mathbb{P}^n$. We can decompose $\mathbb{P}^n$ into two
pieces using $x_n$: the affine space $\mathbb{C}^n \subset \mathbb{P}^n$ defined by $x_n = 1$, and the
“hyperplane at infinity” $\mathbb{P}^{n-1} \subset \mathbb{P}^n$ defined by $x_n = 0$. Note that the
other variables $x_0, \ldots, x_{n-1}$ play two roles: they are ordinary coordinates
for $\mathbb{C}^n \subset \mathbb{P}^n$, and they are homogeneous coordinates for the hyperplane at
infinity.

The equations $F_0 = \cdots = F_{n-1} = 0$ determine a projective variety $V \subset \mathbb{P}^n$.
By (3.3), $f_0 = \cdots = f_{n-1} = 0$ defines the “affine part” $\mathbb{C}^n \cap V \subset V$,
while $\bar F_0 = \cdots = \bar F_{n-1} = 0$ defines the “part at infinity” $\mathbb{P}^{n-1} \cap V \subset V$.
Hence, the hypothesis $\mathrm{Res}(\bar F_0, \ldots, \bar F_{n-1}) \ne 0$ implies that there are no
solutions at infinity. In other words, the projective variety $V$ is contained in
$\mathbb{C}^n \subset \mathbb{P}^n$. Now we can apply the following result from algebraic geometry:

• (Projective Varieties in Affine Space) If a projective variety in $\mathbb{P}^n$ is
contained in an affine space $\mathbb{C}^n \subset \mathbb{P}^n$, then the projective variety must
consist of a finite set of points.

(See, for example, [Sha], §5 of Chapter I.) Applied to $V$, this tells us that $V$
must be a finite set of points. Since $\mathbb{C}$ is algebraically closed and $V \subset \mathbb{C}^n$
is defined by $f_0 = \cdots = f_{n-1} = 0$, the Finiteness Theorem from §2
of Chapter 2 implies that $A = \mathbb{C}[x_0, \ldots, x_{n-1}]/\langle f_0, \ldots, f_{n-1}\rangle$ is finite
dimensional over $\mathbb{C}$. Hence $\det(m_{f_n} : A \to A)$ is defined, so that the
formula of the theorem makes sense.

We also need to know the dimension of the ring $A$. The answer is provided
by Bezout’s Theorem:

• (Bezout’s Theorem) If the equations $F_0 = \cdots = F_{n-1} = 0$ have degree
$d_0, \ldots, d_{n-1}$ and finitely many solutions in $\mathbb{P}^n$, then the number of
solutions (counted with multiplicity) is $d_0 \cdots d_{n-1}$.

(See [Sha], §2 of Chapter II.) This tells us that $V$ has $d_0 \cdots d_{n-1}$
points, counted with multiplicity. Because $V \subset \mathbb{C}^n$ is defined by
$f_0 = \cdots = f_{n-1} = 0$, Theorem (2.2) from Chapter 4 implies that the
number of points in $V$, counted with multiplicity, is the dimension of
$A = \mathbb{C}[x_0, \ldots, x_{n-1}]/\langle f_0, \ldots, f_{n-1}\rangle$. Thus, Bezout’s Theorem shows that
$\dim A = d_0 \cdots d_{n-1}$.

We can now explain why $\mathrm{Res}(\bar F_0, \ldots, \bar F_{n-1})^{d_n}\det(m_{f_n})$ behaves like a
resultant. The first step is to prove that $\det(m_{f_n})$ vanishes if and only if
$F_0 = \cdots = F_n = 0$ has a solution in $\mathbb{P}^n$. If we have a solution $p$, then
$p \in V$ since $F_0(p) = \cdots = F_{n-1}(p) = 0$. But $V \subset \mathbb{C}^n$, so we can write
$p = (a_0, \ldots, a_{n-1}, 1)$, and $f_n(a_0, \ldots, a_{n-1}) = 0$ since $F_n(p) = 0$. Then
Theorem (2.6) of Chapter 2 tells us that $f_n(a_0, \ldots, a_{n-1}) = 0$ is an
eigenvalue of $m_{f_n}$, which proves that $\det(m_{f_n}) = 0$. Conversely, if $\det(m_{f_n}) = 0$,
then one of the eigenvalues of $m_{f_n}$ must be zero. Since the eigenvalues are $f_n(p)$
for $p \in V$ (Theorem (2.6) of Chapter 2 again), we have $f_n(p) = 0$ for some
$p$. Writing $p$ in the form $(a_0, \ldots, a_{n-1}, 1)$, we get a nontrivial solution of
$F_0 = \cdots = F_n = 0$, as desired.

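Finally, we will show that $\mathrm{Res}(\bar F_0, \ldots, \bar F_{n-1})^{d_n}\det(m_{f_n})$ has the
homogeneity properties predicted by Theorem (3.1). If we replace $F_j$ by $\lambda F_j$ for
some $j < n$ and $\lambda \in \mathbb{C} \setminus \{0\}$, then $\overline{\lambda F_j} = \lambda\bar F_j$, and neither $A$ nor $m_{f_n}$ are
affected. Since

$$\mathrm{Res}(\bar F_0, \ldots, \lambda\bar F_j, \ldots, \bar F_{n-1}) = \lambda^{d_0 \cdots d_{j-1}d_{j+1} \cdots d_{n-1}}\,\mathrm{Res}(\bar F_0, \ldots, \bar F_j, \ldots, \bar F_{n-1}),$$

we get the desired power of $\lambda$ because of the exponent $d_n$ in the formula
of the theorem. On the other hand, if we replace $F_n$ with $\lambda F_n$, then
$\mathrm{Res}(\bar F_0, \ldots, \bar F_{n-1})$ and $A$ are unchanged, but $m_{f_n}$ becomes $m_{\lambda f_n} = \lambda m_{f_n}$.
Since

$$\det(\lambda m_{f_n}) = \lambda^{\dim A}\det(m_{f_n}),$$

it follows that we get the correct power of $\lambda$ because, as we showed above,
$A$ has dimension $d_0 \cdots d_{n-1}$.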

This discussion shows that the formula $\mathrm{Res}(\bar F_0, \ldots, \bar F_{n-1})^{d_n}\det(m_{f_n})$
has many of the properties of the resultant, although some important points
were left out (for example, we didn’t prove that it is a polynomial in the
coefficients of the $F_i$). We also know what this formula means geometrically:
it asserts that the resultant is a product of two terms, one coming from
the behavior of $F_0, \ldots, F_{n-1}$ at infinity and the other coming from the
behavior of $f_n = F_n(x_0, \ldots, x_{n-1}, 1)$ on the affine variety determined by
vanishing of $f_0, \ldots, f_{n-1}$. □
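
In the simplest case $n = 1$, the formula of Theorem (3.4) can be tested numerically. Here $\bar F_0 = a_0x_0^{d_0}$, whose resultant is $a_0$ by Exercise 6 of §2, so the formula reads $\mathrm{Res}(F_0, F_1) = a_0^{d_1}\det(m_{f_1})$ on $A = \mathbb{C}[x]/\langle f_0\rangle$. The Python sketch below compares this with the Sylvester determinant, under the assumption that (1.2) is the Sylvester matrix. All function names are our own, the arithmetic is over $\mathbb{Q}$ rather than $\mathbb{C}$, and the check is an illustration on one example rather than a proof.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def sylvester_resultant(f, g):
    """Sylvester determinant of f and g (coefficient lists, highest degree first)."""
    l, m = len(f) - 1, len(g) - 1
    size = l + m
    rows = [[0] * s + list(f) + [0] * (size - l - 1 - s) for s in range(m)]
    rows += [[0] * s + list(g) + [0] * (size - m - 1 - s) for s in range(l)]
    return det(rows)

def companion(f):
    """Matrix of multiplication by x on Q[x]/<f> in the basis 1, x, ..., x^(l-1)."""
    l = len(f) - 1
    M = [[Fraction(0)] * l for _ in range(l)]
    for j in range(l - 1):
        M[j + 1][j] = Fraction(1)                  # x * x^j = x^(j+1)
    for i in range(l):
        M[i][l - 1] = -Fraction(f[l - i]) / f[0]   # reduce x^l modulo f
    return M

def mult_matrix(g, f):
    """Matrix of multiplication by g on Q[x]/<f>, computed by Horner's rule."""
    M = companion(f)
    l = len(M)
    R = [[Fraction(0)] * l for _ in range(l)]
    for c in g:
        R = [[sum(R[i][k] * M[k][j] for k in range(l)) for j in range(l)]
             for i in range(l)]
        for i in range(l):
            R[i][i] += c
    return R

def poisson_rhs(f, g):
    """a_0^(deg g) * det(m_g), the right-hand side of Theorem (3.4) when n = 1."""
    return Fraction(f[0]) ** (len(g) - 1) * det(mult_matrix(g, f))

if __name__ == "__main__":
    f, g = [2, 0, 1, -3], [1, 4, -5]
    print(sylvester_resultant(f, g), poisson_rhs(f, g))
```

The hypothesis $\mathrm{Res}(\bar F_0) \ne 0$ of the theorem corresponds to the requirement `f[0] != 0`, without which `companion` would divide by zero.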

Exercise 4. When $n = 2$, show that Proposition (1.5) is a special case
of Theorem (3.4). Hint: Start with $f, g$ as in (1.1) and homogenize to get
(1.6). Use Exercise 6 of §2 to compute $\mathrm{Res}(\bar F)$.

Exercise 5. Use Theorem (3.4) and getmatrix to compute the resultant
of the polynomials $x^2 + y^2 + z^2$, $xy + xz + yz$, $xyz$.

The formula given in Theorem (3.4) is sometimes called the Poisson 
Formula. Some further applications of this formula will be given in the 
exercises at the end of the section. 

In the special case when $F_0, \ldots, F_n$ all have the same total degree $d > 0$,
the resultant $\mathrm{Res}_{d,\ldots,d}$ has degree $d^n$ in the coefficients of each $F_i$, and its
total degree is $(n + 1)d^n$. Besides all of the properties listed so far, the
resultant has some other interesting properties in this case:

(3.5) Theorem. $\mathrm{Res} = \mathrm{Res}_{d,\ldots,d}$ has the following properties:

a. If $F_j$ are homogeneous of total degree $d$ and $G_i = \sum_{j=0}^n a_{ij}F_j$, where
$(a_{ij})$ is an invertible matrix with entries in $\mathbb{C}$, then

$$\mathrm{Res}(G_0, \ldots, G_n) = \det(a_{ij})^{d^n}\,\mathrm{Res}(F_0, \ldots, F_n).$$

b. If we list all monomials of total degree $d$ as $x^{\alpha(1)}, \ldots, x^{\alpha(N)}$ and pick
$n + 1$ distinct indices $1 \le i_0 < \cdots < i_n \le N$, the bracket $[i_0 \ldots i_n]$ is
defined to be the determinant

$$[i_0 \ldots i_n] = \det(u_{i,\alpha(i_j)}) \in \mathbb{Z}[u_{i,\alpha(j)}].$$

Then Res is a polynomial in the brackets $[i_0 \ldots i_n]$.

Proof. See Proposition 5.11.2 of [Jou] for a proof of part a. For part b,
note that if $(a_{ij})$ has determinant 1, then part a implies $\mathrm{Res}(G_0, \ldots, G_n) =
\mathrm{Res}(F_0, \ldots, F_n)$, so Res is invariant under the action of $\mathrm{SL}(n + 1, \mathbb{C}) =
\{A \in M_{(n+1)\times(n+1)}(\mathbb{C}) : \det(A) = 1\}$ on $(n + 1)$-tuples of homogeneous
polynomials of degree $d$. If we regard the coefficients of the universal
polynomials $F_i$ as an $(n + 1) \times N$ matrix $(u_{i,\alpha(j)})$, then this action is matrix
multiplication by elements of $\mathrm{SL}(n + 1, \mathbb{C})$. Since Res is invariant under this
action, the First Fundamental Theorem of Invariant Theory (see [Stu1],
Section 3.2) asserts that Res is a polynomial in the $(n + 1) \times (n + 1)$
minors of $(u_{i,\alpha(j)})$, which are exactly the brackets $[i_0 \ldots i_n]$. □

Exercise 6. Show that each bracket $[i_0 \ldots i_n] = \det(u_{i,\alpha(i_j)})$ is invariant
under the action of $\mathrm{SL}(n + 1, \mathbb{C})$.

We should mention that the expression of Res in terms of the brackets
$[i_0 \ldots i_n]$ is not unique. The different ways of doing this are determined
by the algebraic relations among the brackets, which are described by
the Second Fundamental Theorem of Invariant Theory (see Section 3.2
of [Stu1]).
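
To make the brackets concrete, the following Python sketch computes every bracket of a coefficient matrix as a maximal minor, and checks on one sample that the brackets are unchanged when the polynomials are replaced by combinations $G_i = \sum_j a_{ij}F_j$ with $\det(a_{ij}) = 1$. The names `det`, `brackets` and `act` and the sample coefficients are our own choices for illustration; a numerical check of this kind is of course not a proof of invariance.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def brackets(coeffs):
    """All maximal minors of an (n+1) x N coefficient matrix, keyed by 1-based column indices."""
    rows, N = len(coeffs), len(coeffs[0])
    return {idx: det([[row[j - 1] for j in idx] for row in coeffs])
            for idx in combinations(range(1, N + 1), rows)}

def act(g, coeffs):
    """Replace F_i by G_i = sum_j g[i][j] F_j, i.e. left-multiply the coefficient matrix by g."""
    return [[sum(g[i][k] * coeffs[k][j] for k in range(len(coeffs)))
             for j in range(len(coeffs[0]))] for i in range(len(g))]

if __name__ == "__main__":
    # Rows: coefficients of three ternary quadrics in a fixed ordering of the
    # six monomials of degree 2 (sample values chosen arbitrarily).
    coeffs = [[1, 2, 3, 4, 5, 6],
              [0, 1, 0, 2, 1, 3],
              [2, 0, 1, 1, 0, 5]]
    g = [[1, 2, 0],
         [0, 1, 3],
         [0, 0, 1]]                     # upper triangular with determinant 1
    b = brackets(coeffs)
    print(len(b))                       # number of brackets [i0 i1 i2]
    print(brackets(act(g, coeffs)) == b)
```

For three ternary quadrics there are $\binom{6}{3} = 20$ brackets, one for each choice of three of the six monomials.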

As an example of Theorem (3.5), consider the resultant of three ternary
quadrics

$$
\begin{aligned}
F_0 &= c_{01}x^2 + c_{02}y^2 + c_{03}z^2 + c_{04}xy + c_{05}xz + c_{06}yz = 0\\
F_1 &= c_{11}x^2 + c_{12}y^2 + c_{13}z^2 + c_{14}xy + c_{15}xz + c_{16}yz = 0\\
F_2 &= c_{21}x^2 + c_{22}y^2 + c_{23}z^2 + c_{24}xy + c_{25}xz + c_{26}yz = 0.
\end{aligned}
$$

In §2, we gave a formula for $\mathrm{Res}_{2,2,2}(F_0, F_1, F_2)$ as a certain $6 \times 6$
determinant. Using Theorem (3.5), we get quite a different formula. If we list the
six monomials of total degree 2 as $x^2, y^2, z^2, xy, xz, yz$, then the bracket
$[i_0 i_1 i_2]$ is given by

$$
[i_0 i_1 i_2] = \det\begin{pmatrix}
c_{0i_0} & c_{0i_1} & c_{0i_2}\\
c_{1i_0} & c_{1i_1} & c_{1i_2}\\
c_{2i_0} & c_{2i_1} & c_{2i_2}
\end{pmatrix}.
$$

By [KSZ], the resultant $\mathrm{Res}_{2,2,2}(F_0, F_1, F_2)$ is the following polynomial in
the brackets $[i_0 i_1 i_2]$:

[145][246][356][456] - [146][156][246][356] - [145][245][256][356]
- [145][246][346][345] + [125][126][356][456] - 2[124][156][256][356]
- [134][136][246][456] - 2[135][146][346][246] + [235][234][145][456]
- 2[236][345][245][145] - [126]^2[156][356] - [125]^2[256][356]
- [134]^2[246][346] - [136]^2[146][246] - [145][245][235]^2
- [145][345][234]^2 + 2[123][124][356][456] - [123][125][346][456]
- [123][134][256][456] + 2[123][135][246][456] - 2[123][145][246][356]
- [124]^2[356]^2 + 2[124][125][346][356] - 2[124][134][256][356]
- 3[124][135][236][456] - 4[124][135][246][356] - [125]^2[346]^2
+ 2[125][135][246][346] - [134]^2[256]^2 + 2[134][135][246][256]
- 2[135]^2[246]^2 - [123][126][136][456] + 2[123][126][146][356]
- 2[124][136]^2[256] - 2[125][126][136][346] + [123][125][235][456]
- 2[123][125][245][356] - 2[124][235]^2[156] - 2[126][125][235][345]
- [123][234][134][456] + 2[123][234][346][145] - 2[236][134]^2[245]
- 2[235][234][134][146] + 3[136][125][235][126] - 3[126][135][236][125]
- [136][125]^2[236] - [126]^2[135][235] - 3[134][136][126][234]
+ 3[124][134][136][236] + [134]^2[126][236] + [124][136]^2[234]
- 3[124][135][234][235] + 3[134][234][235][125] - [135][234]^2[125]
- [124][235]^2[134] - [136]^2[126]^2 - [125]^2[235]^2
- [134]^2[234]^2 + 3[123][124][135][236] + [123][134][235][126]
+ [123][135][126][234] + [123][134][236][125] + [123][136][125][234]
+ [123][124][235][136] - 2[123]^2[126][136] + 2[123]^2[125][235]
- 2[123]^2[134][234] - [123]^4.

This expression for $\mathrm{Res}_{2,2,2}$ has total degree 4 in the brackets since the
resultant has total degree 12 and each bracket has total degree 3 in the $c_{ij}$.
Although this formula is rather complicated, its 68 terms are a lot simpler
than the 21,894 terms we get when we express $\mathrm{Res}_{2,2,2}$ as a polynomial in
the $c_{ij}$!



Exercise 7. When $F_0 = a_0x^2 + a_1xy + a_2y^2$ and $F_1 = b_0x^2 + b_1xy + b_2y^2$,
the only brackets to consider are $[01] = a_0b_1 - a_1b_0$, $[02] = a_0b_2 - a_2b_0$
and $[12] = a_1b_2 - a_2b_1$ (why?). Express $\mathrm{Res}_{2,2}$ as a polynomial in these
three brackets. Hint: In the determinant (1.2), expand along the first row
and then expand along the column containing the zero.

Theorem (3.5) also shows that the resultant of two homogeneous
polynomials $F_0(x, y)$, $F_1(x, y)$ of degree $d$ can be written in terms of the
brackets $[ij]$. The resulting formula is closely related to the Bezout Formula
described in Chapter 12 of [GKZ].

For further properties of resultants, the reader should consult Chapter 13 
of [GKZ] or Section 5 of [Jou]. 



Additional Exercises for §3 

Exercise 8. The product formula (1.4) can be generalized to arbitrary
resultants. With the same hypotheses as Theorem (3.4), let
$V = \mathbf{V}(f_0, \ldots, f_{n-1})$ be as in the proof of the theorem. Then

$$\mathrm{Res}(F_0, \ldots, F_n) = \mathrm{Res}(\bar F_0, \ldots, \bar F_{n-1})^{d_n}\prod_{p \in V} f_n(p)^{m(p)},$$

where $m(p)$ is the multiplicity of $p$ in $V$. This concept is defined in [Sha], §2
of Chapter II, and §2 of Chapter 4. For this exercise, assume that $V$ consists
of $d_0 \cdots d_{n-1}$ distinct points (which means that all of the multiplicities
$m(p)$ are equal to 1) and that $f_n$ takes distinct values on these points.
Then use Theorem (2.6) of Chapter 2, together with Theorem (3.4), to
show that the above formula for the resultant holds in this case.

Exercise 9. In Theorem (3.4), we assumed that the field was $\mathbb{C}$. It turns
out that the result is true over any field $k$. In this exercise, we will use this
version of the theorem to prove part b of Theorem (3.2) when $F_n = F_n'F_n''$.
The trick is to choose $k$ appropriately: we will let $k$ be the field of rational
functions in the coefficients of $F_0, \ldots, F_{n-1}, F_n', F_n''$. This means we regard
each coefficient as a separate variable and then $k$ is the field of rational
functions in these variables with coefficients in $\mathbb{Q}$.

a. Explain why $\bar F_0, \ldots, \bar F_{n-1}$ are the “universal” polynomials of degrees
$d_0, \ldots, d_{n-1}$ in $x_0, \ldots, x_{n-1}$, and conclude that $\mathrm{Res}(\bar F_0, \ldots, \bar F_{n-1})$ is
nonzero.

b. Use Theorem (3.4) (over the field $k$) to show that

$$\mathrm{Res}(F_0, \ldots, F_n) = \mathrm{Res}(F_0, \ldots, F_n') \cdot \mathrm{Res}(F_0, \ldots, F_n'').$$

Notice that you need to use the theorem three times. Hint: $m_{f_n} = m_{f_n'} \circ m_{f_n''}$.



Exercise 10. The goal of this exercise is to generalize Proposition (2.10)
by giving a formula for $\mathrm{Res}_{1,1,d}$ for any $d > 0$. The idea is to apply
Theorem (3.4) when the field $k$ consists of rational functions in the coefficients
of $F_0, F_1, F_2$ (so we are using the version of the theorem from Exercise 9).
For concreteness, suppose that

$$
\begin{aligned}
F_0 &= a_1x + a_2y + a_3z = 0\\
F_1 &= b_1x + b_2y + b_3z = 0.
\end{aligned}
$$

a. Show that $\mathrm{Res}(\bar F_0, \bar F_1) = a_1b_2 - a_2b_1$ and that the only solution of
$f_0 = f_1 = 0$ is

$$x_0 = \frac{a_2b_3 - a_3b_2}{a_1b_2 - a_2b_1}, \qquad y_0 = -\frac{a_1b_3 - a_3b_1}{a_1b_2 - a_2b_1}.$$

b. By Theorem (3.4), $k[x, y]/\langle f_0, f_1\rangle$ has dimension one over $k$. Use
Theorem (2.6) of Chapter 2 to show that

$$\det(m_{f_2}) = f_2(x_0, y_0).$$

c. Since $f_2(x, y) = F_2(x, y, 1)$, use Theorem (3.4) to conclude that

$$\mathrm{Res}_{1,1,d}(F_0, F_1, F_2) = F_2(a_2b_3 - a_3b_2,\ -(a_1b_3 - a_3b_1),\ a_1b_2 - a_2b_1).$$

Note that $a_2b_3 - a_3b_2$, $a_1b_3 - a_3b_1$, $a_1b_2 - a_2b_1$ are the $2 \times 2$ minors of
the matrix

$$\begin{pmatrix} a_1 & a_2 & a_3\\ b_1 & b_2 & b_3 \end{pmatrix}.$$

d. Use part c to verify the formula for $\mathrm{Res}_{1,1,2}$ given in Proposition (2.10).

e. Formulate and prove a formula similar to part c for the resultant
$\mathrm{Res}_{1,\ldots,1,d}$. Hint: Use Cramer’s Rule. The formula (with proof) can be
found in Proposition 5.4.4 of [Jou].

Exercise 11. Consider the elementary symmetric functions $\sigma_1, \ldots, \sigma_n \in
\mathbb{C}[x_1, \ldots, x_n]$. These are defined by

$$
\begin{aligned}
\sigma_1 &= x_1 + \cdots + x_n\\
&\;\;\vdots\\
\sigma_r &= \sum_{i_1 < i_2 < \cdots < i_r} x_{i_1}x_{i_2} \cdots x_{i_r}\\
&\;\;\vdots\\
\sigma_n &= x_1x_2 \cdots x_n.
\end{aligned}
$$

Since $\sigma_i$ is homogeneous of total degree $i$, the resultant $\mathrm{Res}(\sigma_1, \ldots, \sigma_n)$
is defined. The goal of this exercise is to prove that this resultant equals
$-1$ for all $n > 1$. Note that this exercise deals with $n$ polynomials and $n$
variables rather than $n + 1$.

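a. Show that $\mathrm{Res}(x + y, xy) = -1$.

b. To prove the result for $n > 2$, we will use induction and Theorem (3.4).
Thus, let

$$
\begin{aligned}
\bar\sigma_i &= \sigma_i(x_1, \ldots, x_{n-1}, 0)\\
\tilde\sigma_i &= \sigma_i(x_1, \ldots, x_{n-1}, 1)
\end{aligned}
$$

as in (3.3). Prove that $\bar\sigma_i$ is the $i$th elementary symmetric function in
$x_1, \ldots, x_{n-1}$ and that $\tilde\sigma_i = \bar\sigma_i + \bar\sigma_{i-1}$ (where $\bar\sigma_0 = 1$).

c. If $A = \mathbb{C}[x_1, \ldots, x_{n-1}]/\langle\tilde\sigma_1, \ldots, \tilde\sigma_{n-1}\rangle$, then use part b to prove that
the multiplication map $m_{\tilde\sigma_n} : A \to A$ is multiplication by $(-1)^{n-1}$. Hint:
Observe that $\tilde\sigma_n = \bar\sigma_{n-1}$.

d. Use induction and Theorem (3.5) to show that $\mathrm{Res}(\sigma_1, \ldots, \sigma_n) = -1$
for all $n > 1$.

Exercise 12. Using the notation of Theorem (3.4), show that
$\mathrm{Res}(F_0, \ldots, F_{n-1}, x_n^d) = \mathrm{Res}(\bar F_0, \ldots, \bar F_{n-1})^d$.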



§4 Computing Resultants 

Our next task is to discuss methods for computing resultants. While
Theorem (3.4) allows one to compute resultants inductively (see Exercise 5 of §3
for an example), it is useful to have other tools for working with resultants.
In this section, we will give some further formulas for the resultant and
then discuss the practical aspects of computing $\mathrm{Res}_{d_0,\ldots,d_n}$. We will begin
by generalizing the method used in Proposition (2.10) to find a formula for
$\mathrm{Res}_{1,1,2}$. Recall that the essence of what we did in (2.11) was to multiply
each equation by appropriate monomials so that we got a square matrix
whose determinant we could take.

To do this in general, suppose we have $F_0, \ldots, F_n \in \mathbb{C}[x_0, \ldots, x_n]$ of
total degrees $d_0, \ldots, d_n$. Then set

$$d = \sum_{i=0}^n (d_i - 1) + 1 = \sum_{i=0}^n d_i - n.$$

For instance, when $(d_0, d_1, d_2) = (1, 1, 2)$ as in the example in Section 2,
one computes that $d = 2$, which is precisely the degree of the monomials
on the left hand side of the equations following (2.11).

Exercise 1. Monomials of total degree $d$ have the following special
property which will be very important below: each such monomial is divisible
by $x_i^{d_i}$ for at least one $i$ between 0 and $n$. Prove this. Hint: Argue by
contradiction.

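Now take the monomials $x^\alpha = x_0^{a_0} \cdots x_n^{a_n}$ of total degree $d$ and divide
them into $n + 1$ sets as follows:

$$
\begin{aligned}
S_0 &= \{x^\alpha : |\alpha| = d,\ x_0^{d_0} \text{ divides } x^\alpha\}\\
S_1 &= \{x^\alpha : |\alpha| = d,\ x_0^{d_0} \text{ doesn’t divide } x^\alpha \text{ but } x_1^{d_1} \text{ does}\}\\
&\;\;\vdots\\
S_n &= \{x^\alpha : |\alpha| = d,\ x_0^{d_0}, \ldots, x_{n-1}^{d_{n-1}} \text{ don’t divide } x^\alpha \text{ but } x_n^{d_n} \text{ does}\}.
\end{aligned}
$$

By Exercise 1, every monomial of total degree $d$ lies in one of $S_0, \ldots, S_n$.
Note also that these sets are mutually disjoint. One observation we will
need is the following:

if $x^\alpha \in S_i$, then we can write $x^\alpha = x_i^{d_i} \cdot x^\alpha/x_i^{d_i}$.

Notice that $x^\alpha/x_i^{d_i}$ is a monomial of total degree $d - d_i$ since $x^\alpha \in S_i$.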
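
The construction of the sets $S_0, \ldots, S_n$ is easy to carry out by machine. The following Python sketch (function names are our own) enumerates the exponent vectors of total degree $d$ and places each monomial in $S_i$ for the first index $i$ such that $x_i^{d_i}$ divides it. It is meant only as an illustration of the definition; the brute-force enumeration is adequate for small examples but not efficient.

```python
from itertools import product

def monomials(num_vars, degree):
    """All exponent vectors (a_0, ..., a_n) with a_0 + ... + a_n = degree."""
    return [e for e in product(range(degree + 1), repeat=num_vars)
            if sum(e) == degree]

def partition_monomials(degrees):
    """Return d and the list [S_0, ..., S_n] of exponent vectors for the given degrees."""
    n = len(degrees) - 1
    d = sum(degrees) - n
    sets = [[] for _ in degrees]
    for alpha in monomials(n + 1, d):
        # First index i with a_i >= d_i, i.e. with x_i^{d_i} dividing x^alpha.
        # Such an index always exists, as Exercise 1 asserts.
        i = next(i for i, (a, di) in enumerate(zip(alpha, degrees)) if a >= di)
        sets[i].append(alpha)
    return d, sets

if __name__ == "__main__":
    d, sets = partition_monomials((1, 1, 2))
    print("d =", d)
    for i, S in enumerate(sets):
        print("S_%d:" % i, S)
```

Each exponent vector `(a_0, a_1, a_2)` stands for the monomial $x^{a_0}y^{a_1}z^{a_2}$ when the variables are $x, y, z$.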

Exercise 2. When $(d_0, d_1, d_2) = (1, 1, 2)$, show that $S_0 = \{x^2, xy, xz\}$,
$S_1 = \{y^2, yz\}$, and $S_2 = \{z^2\}$, where we are using $x, y, z$ as variables.
Write down all of the $x^\alpha/x_i^{d_i}$ in this case and see if you can find these
monomials in the equations (2.11).

Exercise 3. Prove that the number of monomials in $S_n$ is exactly
$d_0 \cdots d_{n-1}$. This fact will play an extremely important role in what
follows. Hint: Given integers $a_0, \ldots, a_{n-1}$ with $0 \le a_i \le d_i - 1$, prove that
there is a unique $a_n$ such that $x_0^{a_0} \cdots x_n^{a_n} \in S_n$. Exercise 1 will also be
useful.

Now we can write down a system of equations that generalizes (2.11).
Namely, consider the equations

(4.1)
$$
\begin{aligned}
x^\alpha/x_0^{d_0} \cdot F_0 &= 0 \quad \text{for all } x^\alpha \in S_0\\
&\;\;\vdots\\
x^\alpha/x_n^{d_n} \cdot F_n &= 0 \quad \text{for all } x^\alpha \in S_n.
\end{aligned}
$$

Exercise 4. When $(d_0, d_1, d_2) = (1, 1, 2)$, check that the system of
equations given by (4.1) is exactly what we wrote down in (2.11).

Since $F_i$ has total degree $d_i$, it follows that $x^\alpha/x_i^{d_i} \cdot F_i$ has total degree
$d$. Thus each polynomial on the left side of (4.1) can be written as a linear
combination of monomials of total degree $d$. Suppose that there are $N$ such
monomials. (In the exercises at the end of the section, you will show that $N$
equals the binomial coefficient $\binom{d+n}{n}$.) Then observe that the total number
of equations is the number of elements in $S_0 \cup \cdots \cup S_n$, which is also $N$.
Thus, regarding the monomials of total degree $d$ as unknowns, we get a
system of $N$ linear equations in $N$ unknowns.

(4.2) Definition. The determinant of the coefficient matrix of the $N \times N$
system of equations given by (4.1) is denoted $D_n$.
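
The coefficient matrix of (4.1) can be assembled mechanically. In the Python sketch below, a polynomial is represented as a dictionary from exponent vectors to coefficients; this representation and the names `det` and `macaulay_matrix` are our own choices, not notation from the text. The rows are indexed by the monomials $x^\alpha \in S_i$ and the columns by all monomials of total degree $d$. Since the ordering of rows and columns is arbitrary here, the determinant is computed only up to sign.

```python
from fractions import Fraction
from itertools import product

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def macaulay_matrix(polys, degrees):
    """Coefficient matrix of the system (4.1) for homogeneous polys of the given degrees."""
    n = len(degrees) - 1
    d = sum(degrees) - n
    basis = [e for e in product(range(d + 1), repeat=n + 1) if sum(e) == d]
    column = {mono: k for k, mono in enumerate(basis)}
    rows = []
    for i, (F, di) in enumerate(zip(polys, degrees)):
        for alpha in basis:
            # alpha lies in S_i when i is the first index with a_i >= d_i.
            first = next(j for j, (a, dj) in enumerate(zip(alpha, degrees)) if a >= dj)
            if first != i:
                continue
            mult = list(alpha)
            mult[i] -= di                      # exponent vector of x^alpha / x_i^{d_i}
            row = [0] * len(basis)
            for beta, c in F.items():
                target = tuple(m + b for m, b in zip(mult, beta))
                row[column[target]] += c
            rows.append(row)
    return rows

if __name__ == "__main__":
    degrees = (1, 1, 2)                        # variables ordered (x, y, z)
    F0 = {(1, 0, 0): 1, (0, 1, 0): -1}         # x - y
    F1 = {(0, 1, 0): 1, (0, 0, 1): -1}         # y - z
    F2 = {(2, 0, 0): 1, (0, 1, 1): -1}         # x^2 - yz
    print(det(macaulay_matrix([F0, F1, F2], degrees)))
```

The three sample polynomials vanish at $(1, 1, 1)$, so the values of the degree 2 monomials at that point give a nonzero vector in the kernel of the matrix, and the determinant printed is 0.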



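For example, if we have

(4.3)
$$
\begin{aligned}
F_0 &= a_1x + a_2y + a_3z = 0\\
F_1 &= b_1x + b_2y + b_3z = 0\\
F_2 &= c_1x^2 + c_2y^2 + c_3z^2 + c_4xy + c_5xz + c_6yz = 0,
\end{aligned}
$$

then the equations following (2.11) imply that

(4.4)
$$
D_2 = \det\begin{pmatrix}
a_1 & 0 & 0 & a_2 & a_3 & 0\\
0 & a_2 & 0 & a_1 & 0 & a_3\\
0 & 0 & a_3 & 0 & a_1 & a_2\\
0 & b_2 & 0 & b_1 & 0 & b_3\\
0 & 0 & b_3 & 0 & b_1 & b_2\\
c_1 & c_2 & c_3 & c_4 & c_5 & c_6
\end{pmatrix}.
$$
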


Exercise 5. When we have polynomials $F_0, F_1 \in \mathbb{C}[x, y]$ as in (1.6), show
that the coefficient matrix of (4.1) is exactly the transpose of the matrix
(1.2). Thus, $D_1 = \mathrm{Res}(F_0, F_1)$ in this case.



Here are some general properties of $D_n$.

Exercise 6. Since $D_n$ is the determinant of the coefficient matrix of (4.1),
it is clearly a polynomial in the coefficients of the $F_i$.

a. For a fixed $i$ between 0 and $n$, show that $D_n$ is homogeneous in the
coefficients of $F_i$ of degree equal to the number $\mu_i$ of elements in $S_i$.
Hint: Show that replacing $F_i$ by $\lambda F_i$ has the effect of multiplying a certain
number (how many?) equations of (4.1) by $\lambda$. How does this affect the
determinant of the coefficient matrix?

b. Use Exercise 3 to show that $D_n$ has degree $d_0 \cdots d_{n-1}$ as a polynomial
in the coefficients of $F_n$. Hint: If you multiply each coefficient of $F_n$ by
$\lambda \in \mathbb{C}$, show that $D_n$ gets multiplied by $\lambda^{d_0 \cdots d_{n-1}}$.

c. What is the total degree of $D_n$? Hint: Exercise 19 will be useful.

Exercise 7. In this exercise, you will prove that $D_n$ is divisible by the
resultant.

a. Prove that $D_n$ vanishes whenever $F_0 = \cdots = F_n = 0$ has a nontrivial
solution. Hint: If the $F_i$ all vanish at $(c_0, \ldots, c_n) \ne (0, \ldots, 0)$, then
show that the monomials of total degree $d$ in $c_0, \ldots, c_n$ give a nontrivial
solution of (4.1).

b. Using the notation from the end of §2, we have $\mathbf{V}(\mathrm{Res}) \subset \mathbb{C}^M$, where
$\mathbb{C}^M$ is the affine space whose variables are the coefficients $u_{i,\alpha}$ of $F_0, \ldots, F_n$.
Explain why part a implies that $D_n$ vanishes on $\mathbf{V}(\mathrm{Res})$.

c. Adapt the argument of Proposition (2.10) to prove that $D_n \in \langle\mathrm{Res}\rangle$, so
that Res divides $D_n$.

Exercise 7 shows that we are getting close to the resultant, for it enables
us to write

(4.5) $D_n = \mathrm{Res} \cdot \text{extraneous factor}$.

We next show that the extraneous factor doesn’t involve the coefficients of
$F_n$ and in fact uses only some of the coefficients of $F_0, \ldots, F_{n-1}$.

(4.6) Proposition. The extraneous factor in (4.5) is an integer polynomial
in the coefficients of $\bar F_0, \ldots, \bar F_{n-1}$, where $\bar F_i = F_i(x_0, \ldots, x_{n-1}, 0)$.

Proof. Since $D_n$ is a determinant, it is a polynomial in $\mathbb{Z}[u_{i,\alpha}]$, and we
also know that $\mathrm{Res} \in \mathbb{Z}[u_{i,\alpha}]$. Exercise 7 took place in $\mathbb{C}[u_{i,\alpha}]$ (because of
the Nullstellensatz), but in fact, the extraneous factor (let’s call it $E_n$) must
lie in $\mathbb{Q}[u_{i,\alpha}]$ since dividing $D_n$ by Res produces at worst rational
coefficients. Since Res is irreducible in $\mathbb{Z}[u_{i,\alpha}]$, standard results about polynomial
rings over $\mathbb{Z}$ imply that $E_n \in \mathbb{Z}[u_{i,\alpha}]$ (see Exercise 20 for details).

Since $D_n = \mathrm{Res} \cdot E_n$ is homogeneous in the coefficients of $F_n$, Exercise 20
at the end of the section implies that Res and $E_n$ are also homogeneous
in these coefficients. But by Theorem (3.1) and Exercise 6, both Res and
$D_n$ have degree $d_0 \cdots d_{n-1}$ in the coefficients of $F_n$. It follows immediately
that $E_n$ has degree zero in the coefficients of $F_n$, so that it depends only
on the coefficients of $F_0, \ldots, F_{n-1}$.

To complete the proof, we must show that $E_n$ depends only on the
coefficients of the $\bar F_i$. This means that coefficients of $F_0, \ldots, F_{n-1}$ with $x_n$ to
a positive power don’t appear in $E_n$. To prove this, we use the following
clever argument of Macaulay (see [Mac1]). As above, we think of Res, $D_n$
and $E_n$ as polynomials in the $u_{i,\alpha}$, and we define the weight of $u_{i,\alpha}$ to be
the exponent $a_n$ of $x_n$ (where $\alpha = (a_0, \ldots, a_n)$). Then, the weight of a
monomial in the $u_{i,\alpha}$, say $u_{i_1,\alpha_1}^{m_1} \cdots u_{i_l,\alpha_l}^{m_l}$, is defined to be the sum of the
weights of each $u_{i_j,\alpha_j}$ multiplied by the corresponding exponents. Finally, a
polynomial in the $u_{i,\alpha}$ is said to be isobaric if every term in the polynomial
has the same weight.

In Exercise 23 at the end of the section, you will prove that every term
in $D_n$ has weight $d_0 \cdots d_n$, so that $D_n$ is isobaric. The same exercise will
show that $D_n = \mathrm{Res} \cdot E_n$ implies that Res and $E_n$ are isobaric and that the
weight of $D_n$ is the sum of the weights of Res and $E_n$. Hence, it suffices to
prove that $E_n$ has weight zero (be sure you understand this). To simplify
notation, let $u_i$ be the variable representing the coefficient of $x_i^{d_i}$ in $F_i$.
Note that $u_0, \ldots, u_{n-1}$ have weight zero while $u_n$ has weight $d_n$. Then
Theorems (2.3) and (3.1) imply that one of the terms of Res is

$$\pm u_0^{d_1 \cdots d_n}u_1^{d_0d_2 \cdots d_n} \cdots u_n^{d_0 \cdots d_{n-1}}$$

(see Exercise 23). This term has weight $d_0 \cdots d_n$, which shows that the
weight of Res is $d_0 \cdots d_n$. We saw above that $D_n$ has the same weight, and
it follows that $E_n$ has weight zero, as desired. □

Although the extraneous factor in (4.5) involves fewer coefficients than 
the resultant, it can have a very large degree, as shown by the following 
example. 

Exercise 8. When $d_i = 2$ for $0 \le i \le 4$, show that the resultant has total
degree 80 while $D_4$ has total degree 210. What happens when $d_i = 3$ for
$0 \le i \le 4$? Hint: Use Exercises 6 and 19.
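
The degree bookkeeping behind this comparison is easy to automate. The Python sketch below (function names are our own) assumes two facts stated in the text: the total degree of $D_n$ equals the number $N = \binom{d+n}{n}$ of rows of the matrix of (4.1), since each row is linear in the coefficients of a single $F_i$, and the total degree of Res is given by Theorem (3.1). The difference of the two is then the total degree of the extraneous factor in (4.5).

```python
from math import comb, prod

def macaulay_total_degree(degrees):
    """Total degree of D_n: the number N of monomials of degree d in n + 1 variables."""
    n = len(degrees) - 1
    d = sum(degrees) - n
    return comb(d + n, n)

def resultant_total_degree(degrees):
    """Total degree of Res according to Theorem (3.1)."""
    return sum(prod(d for i, d in enumerate(degrees) if i != j)
               for j in range(len(degrees)))

def extraneous_degree(degrees):
    """Total degree of the extraneous factor D_n / Res."""
    return macaulay_total_degree(degrees) - resultant_total_degree(degrees)

if __name__ == "__main__":
    for ds in [(1, 1, 2), (2, 2, 2), (2,) * 5, (3,) * 5]:
        print(ds, macaulay_total_degree(ds), resultant_total_degree(ds),
              extraneous_degree(ds))
```

Running the loop for a few degree sequences gives a feel for how quickly the extraneous factor grows relative to the resultant itself.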

Notice that Proposition (4.6) also gives a method for computing the
resultant: just factor $D_n$ into irreducibles, and the only irreducible factor
in which all variables appear is the resultant! Unfortunately, this method
is wildly impractical owing to the slowness of multivariable factorization
(especially for polynomials as large as $D_n$).

In the above discussion, the sets $S_0, \ldots, S_n$ and the determinant $D_n$
depended on how the variables $x_0, \ldots, x_n$ were ordered. In fact, the notation
$D_n$ was chosen to emphasize that the variable $x_n$ came last. If we fix $i$
between 0 and $n - 1$ and order the variables so that $x_i$ comes last, then
we get slightly different sets $S_0, \ldots, S_n$ and a slightly different system of
equations (4.1). We will let $D_i$ denote the determinant of this system of
equations. (Note that there are many different orderings of the variables
for which $x_i$ is the last. We pick just one when computing $D_i$.)

Exercise 9. Show that $D_i$ is homogeneous in the coefficients of each $F_j$
and in particular, is homogeneous of degree $d_0 \cdots d_{i-1}d_{i+1} \cdots d_n$ in the
coefficients of $F_i$.

We can now prove the following classical formula for Res. 

(4.7) Proposition. When $F_0, \ldots, F_n$ are universal polynomials as at the
end of §2, the resultant is the greatest common divisor of the polynomials
$D_0, \ldots, D_n$ in the ring $\mathbb{Z}[u_{i,\alpha}]$, i.e.,

$$\mathrm{Res} = \pm\mathrm{GCD}(D_0, \ldots, D_n).$$

Proof. For each $i$, there are many choices for $D_i$ (corresponding to the
$(n - 1)!$ ways of ordering the variables with $x_i$ last). We need to prove that
no matter which of the various $D_i$ we pick for each $i$, the greatest common
divisor of $D_0, \ldots, D_n$ is the resultant (up to a sign).

By Exercise 7, we know that Res divides $D_n$, and the same is clearly
true for $D_0, \ldots, D_{n-1}$. Furthermore, the argument used in the proof of
Proposition (4.6) shows that $D_i = \mathrm{Res} \cdot E_i$, where $E_i \in \mathbb{Z}[u_{i,\alpha}]$ doesn’t
involve the coefficients of $F_i$. It follows that

$$\mathrm{GCD}(D_0, \ldots, D_n) = \mathrm{Res} \cdot \mathrm{GCD}(E_0, \ldots, E_n).$$

Since each $E_i$ doesn’t involve the variables $u_{i,\alpha}$, the GCD on the right
must be constant, i.e., an integer. However, since the coefficients of $D_n$ are
relatively prime (see Exercise 10 below), this integer must be $\pm 1$, and we
are done. Note that GCD’s are only determined up to invertible elements,
and in $\mathbb{Z}[u_{i,\alpha}]$, the only invertible elements are $\pm 1$. □

Exercise 10. Show that $D_n(x_0^{d_0}, \dots, x_n^{d_n}) = \pm 1$, and conclude that as
a polynomial in $\mathbb{Z}[u_{i,\alpha}]$, the coefficients of $D_n$ are relatively prime. Hint:
If you order the monomials of total degree $d$ appropriately, the matrix of
(4.1) will be the identity matrix when $F_i = x_i^{d_i}$.

While the formula of Proposition (4.7) is very pretty, it is not particularly
useful in practice. This brings us to our final resultant formula, which will
tell us exactly how to find the extraneous factor in (4.5). The key idea,
due to Macaulay, is that the extraneous factor is in fact a minor (i.e., the
determinant of a submatrix) of the $N \times N$ matrix from (4.1). To describe
this minor, we need to know which rows and columns of the matrix to
delete. Recall also that we can label the rows and columns of the matrix of
(4.1) using all monomials of total degree $d = \sum_{i=0}^{n} d_i - n$. Given such a
monomial $x^\alpha$, Exercise 1 implies that $x_i^{d_i}$ divides $x^\alpha$ for at least one $i$.

(4.8) Definition. Let $d_0, \dots, d_n$ and $d$ be as usual.

a. A monomial $x^\alpha$ of total degree $d$ is reduced if $x_i^{d_i}$ divides $x^\alpha$ for exactly
one $i$.

b. $D_n'$ is the determinant of the submatrix of the coefficient matrix of (4.1)
obtained by deleting all rows and columns corresponding to reduced
monomials $x^\alpha$.



Exercise 11. When $(d_0, d_1, d_2) = (1, 1, 2)$, we have $d = 2$. Show that all
monomials of degree 2 are reduced except for $xy$. Then show that $D_2' = a_1$,
corresponding to the submatrix of (4.4) obtained by deleting everything
but row 2 and column 4.



Exercise 12. Here are some properties of reduced monomials and $D_n'$.

a. Show that the number of reduced monomials is equal to

$$\sum_{j=0}^{n} d_0 \cdots d_{j-1} d_{j+1} \cdots d_n.$$

Hint: Adapt the argument used in Exercise 3.

b. Show that $D_n'$ has the same total degree as the extraneous factor in (4.5)
and that it doesn’t depend on the coefficients of $F_n$. Hint: Use part a
and note that all monomials in $S_n$ are reduced.
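
Definition (4.8) and the count in Exercise 12a can be checked by brute force for small degrees. The following is only an illustrative sketch: the function names are hypothetical, monomials are represented by their exponent tuples, and the check covers a few degree vectors rather than proving the formula.

```python
from itertools import product
from math import prod


def monomials(num_vars, degree):
    """All exponent tuples of the given total degree in num_vars variables."""
    return [e for e in product(range(degree + 1), repeat=num_vars)
            if sum(e) == degree]


def is_reduced(exponents, degrees):
    """True if x_i^{d_i} divides the monomial for exactly one index i."""
    return sum(a >= d for a, d in zip(exponents, degrees)) == 1


def count_reduced(degrees):
    """Count reduced monomials of total degree d = d_0 + ... + d_n - n."""
    n = len(degrees) - 1
    d = sum(degrees) - n
    return sum(is_reduced(e, degrees) for e in monomials(n + 1, d))


def formula(degrees):
    """Sum over j of the product of all d_i with i != j (Exercise 12a)."""
    total = prod(degrees)
    return sum(total // dj for dj in degrees)


for degs in [(1, 1, 2), (1, 2, 2), (2, 2, 2)]:
    print(degs, count_reduced(degs), formula(degs))
```

For $(d_0, d_1, d_2) = (1, 1, 2)$ the only non-reduced exponent tuple is $(1, 1, 0)$, i.e., the monomial $xy$ of Exercise 11.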



Macaulay’s observation is that the extraneous factor in (4.5) is exactly
$D_n'$ up to a sign. This gives the following formula for the resultant as a
quotient of two determinants.

(4.9) Theorem. When $F_0, \dots, F_n$ are universal polynomials, the resultant
is given by

$$\mathrm{Res} = \pm \frac{D_n}{D_n'}.$$

Further, if $k$ is any field and $F_0, \dots, F_n \in k[x_0, \dots, x_n]$, then the above
formula for Res holds whenever $D_n' \ne 0$.

Proof. The only proof we are aware of is in Macaulay’s original paper
[Mac2]. □



Exercise 13. Using $x_0, x_1, x_2$ as variables with $x_0$ regarded as last, write
$\mathrm{Res}_{1,2,2}$ as a quotient $D_0/D_0'$ of two determinants and write down the
matrices involved (of sizes $10 \times 10$ and $2 \times 2$ respectively). The reason for
using $D_0/D_0'$ instead of $D_2/D_2'$ will become clear in Exercise 2 of §5. A
similar example is worked out in detail in [BGW].



While Theorem (4.9) applies to all resultants, it has some disadvantages.
In the universal case, it requires dividing two very large polynomials, which
can be very time consuming, and in the numerical case, we have the awkward
situation where both $D_n$ and $D_n'$ vanish, as shown by the following
exercise.

Exercise 14. Give an example of polynomials of degrees 1, 1, 2 for which
the resultant is nonzero yet the determinants $D_2$ and $D_2'$ both vanish. Hint:
See Exercise 10.



Because of this phenomenon, it would be nice if the resultant could be
expressed as a single determinant, as happens with $\mathrm{Res}_{l,m}$. It is not known
if this is possible in general, though many special cases have been found. We
saw one example in the formula (2.8) for $\mathrm{Res}_{2,2,2}$. This can be generalized
(in several ways) to give formulas for $\mathrm{Res}_{l,l,l}$ and $\mathrm{Res}_{l,l,l,l}$ when $l \ge 2$ (see
[GKZ], Chapter 3, §4 and Chapter 13, §1, and [Sal], Arts. 90 and 91). As an
example of these formulas, the following exercise will show how to express
$\mathrm{Res}_{l,l,l}$ as a single determinant of size $2l^2 - l$ when $l \ge 2$.



Exercise 15. Suppose that $F_0, F_1, F_2 \in \mathbb{C}[x, y, z]$ have total degree $l \ge 2$.
Before we can state our formula, we need to create some auxiliary equations.
Given nonnegative integers $a, b, c$ with $a + b + c = l - 1$, show that
every monomial of total degree $l$ in $x, y, z$ is divisible by either $x^{a+1}$, $y^{b+1}$,
or $z^{c+1}$, and conclude that we can write $F_0, F_1, F_2$ in the form

$$\begin{aligned}
F_0 &= x^{a+1} P_0 + y^{b+1} Q_0 + z^{c+1} R_0 \\
F_1 &= x^{a+1} P_1 + y^{b+1} Q_1 + z^{c+1} R_1 \\
F_2 &= x^{a+1} P_2 + y^{b+1} Q_2 + z^{c+1} R_2.
\end{aligned} \tag{4.10}$$

There may be many ways of doing this. We will regard $F_0, F_1, F_2$ as universal
polynomials and pick one particular choice for (4.10). Then set

$$F_{a,b,c} = \det \begin{pmatrix} P_0 & Q_0 & R_0 \\ P_1 & Q_1 & R_1 \\ P_2 & Q_2 & R_2 \end{pmatrix}.$$

You should check that $F_{a,b,c}$ has total degree $2l - 2$.
Then consider the equations

$$\begin{aligned}
x^\alpha \cdot F_0 &= 0, && x^\alpha \text{ of total degree } l - 2 \\
x^\alpha \cdot F_1 &= 0, && x^\alpha \text{ of total degree } l - 2 \\
x^\alpha \cdot F_2 &= 0, && x^\alpha \text{ of total degree } l - 2 \\
F_{a,b,c} &= 0, && x^a y^b z^c \text{ of total degree } l - 1.
\end{aligned} \tag{4.11}$$

Each polynomial on the left hand side has total degree $2l - 2$, and you
should prove that there are $2l^2 - l$ monomials of this total degree. Thus we
can regard the equations in (4.11) as having $2l^2 - l$ unknowns. You should
also prove that the number of equations is $2l^2 - l$. Thus the coefficient
matrix of (4.11), which we will denote $C_l$, is a $(2l^2 - l) \times (2l^2 - l)$ matrix.

In the following steps, you will prove that the resultant is given by

$$\mathrm{Res}_{l,l,l}(F_0, F_1, F_2) = \pm \det(C_l).$$

a. If $(u, v, w) \ne (0, 0, 0)$ is a solution of $F_0 = F_1 = F_2 = 0$, show that
$F_{a,b,c}$ vanishes at $(u, v, w)$. Hint: Regard (4.10) as a system of equations
in unknowns $x^{a+1}, y^{b+1}, z^{c+1}$.

b. Use standard arguments to show that $\mathrm{Res}_{l,l,l}$ divides $\det(C_l)$.

c. Show that $\det(C_l)$ has degree $l^2$ in the coefficients of $F_0$. Show that the
same is true for $F_1$ and $F_2$.

d. Conclude that $\mathrm{Res}_{l,l,l}$ is a multiple of $\det(C_l)$.

e. When $(F_0, F_1, F_2) = (x^l, y^l, z^l)$, show that $\det(C_l) = \pm 1$. Hint: Show
that $F_{a,b,c} = x^{l-1-a} y^{l-1-b} z^{l-1-c}$ and that all monomials of total degree
$2l - 2$ not divisible by $x^l, y^l, z^l$ can be written uniquely in this form. Then
show that $C_l$ is the identity matrix when the equations and monomials
in (4.11) are ordered appropriately.

f. Conclude that $\mathrm{Res}_{l,l,l}(F_0, F_1, F_2) = \pm \det(C_l)$.

Exercise 16. Use Exercise 15 to compute the following resultants. 

a. $\mathrm{Res}(x^2 + y^2 + z^2,\ xy + xz + yz,\ x^2 + 2xz + 3y^2)$.

b. $\mathrm{Res}(st + su + tu + u^2(1 - x),\ st + su + t^2 + u^2(2 - y),\ s^2 + su + tu - u^2 z)$,

where the variables are $s, t, u$, and $x, y, z$ are part of the coefficients.
Note that your answer should agree with what you found in Exercise 3
of §2.

We will end this section with a brief discussion of some of the practical
aspects of computing resultants. All of the methods we’ve seen involve
computing determinants or ratios of determinants. Since the usual formula
for an $N \times N$ determinant involves $N!$ terms, we will need some clever
methods for computing large determinants.

As Exercise 16 illustrates, the determinants can be either numerical,
with purely numerical coefficients (as in part a of the exercise), or symbolic,
with coefficients involving other variables (as in part b). Let’s begin
with numerical determinants. In most cases, this means determinants whose
entries are rational numbers, which can be reduced to integer entries by
clearing denominators. The key idea here is to reduce modulo a prime $p$ and
do arithmetic over the finite field $\mathbb{F}_p$ of the integers mod $p$. Computing the
determinant here is easier since we are working over a field, which allows
us to use standard algorithms from linear algebra (using row and column
operations) to find the determinant. Another benefit is that we don’t have
to worry how big the numbers are getting (since we always reduce mod $p$).
Hence we can compute the determinant mod $p$ fairly easily. Then we do this
for several primes $p_1, \dots, p_r$ and use the Chinese Remainder Theorem to
recover the original determinant. Strategies for how to choose the size and
number of primes $p_i$ are discussed in [CM] and [Man2], and the sparseness
properties of the matrices in Theorem (4.9) are exploited in [CKL].
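
The modular strategy just described can be sketched in a few lines. This is a minimal illustration and not the method of [CM] or [Man2]: the function names are hypothetical, the primes are chosen by hand, and it is assumed that the product of the primes exceeds twice the absolute value of the determinant (for instance via an a priori bound), so that the symmetric representative is the true value.

```python
def det_mod_p(matrix, p):
    """Determinant of an integer matrix modulo a prime p (elimination over F_p)."""
    a = [[x % p for x in row] for row in matrix]
    n = len(a)
    det = 1
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if a[r][col]), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)  # modular inverse (Python 3.8+)
        for r in range(col + 1, n):
            factor = a[r][col] * inv % p
            for c in range(col, n):
                a[r][c] = (a[r][c] - factor * a[col][c]) % p
    return det % p


def crt_symmetric(residues, primes):
    """Chinese Remainder Theorem; result is taken in the range (-M/2, M/2]."""
    value, modulus = 0, 1
    for r, p in zip(residues, primes):
        # Choose t with value + modulus * t congruent to r modulo p.
        t = (r - value) * pow(modulus, -1, p) % p
        value += modulus * t
        modulus *= p
    return value - modulus if value > modulus // 2 else value


def det_via_crt(matrix, primes):
    """Integer determinant recovered from its residues modulo several primes."""
    return crt_symmetric([det_mod_p(matrix, p) for p in primes], primes)


example = [[2, -3, 1], [5, 7, -4], [-6, 1, 9]]
print(det_via_crt(example, [101, 103]))
```

All intermediate numbers stay below the largest prime, which is the benefit noted in the text; the price is that enough primes must be used for the final reconstruction to be unambiguous.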

This method works fine provided that the resultant is given as a single
determinant or a quotient where the denominator is nonzero. But when we
have a situation like Exercise 14, where the denominator of the quotient
is zero, something else is needed. One way to avoid this problem, due to
Canny [Can1], is to prevent determinants from vanishing by making some
coefficients symbolic. Suppose we have $F_0, \dots, F_n \in \mathbb{Z}[x_0, \dots, x_n]$. The
determinants $D_n$ and $D_n'$ from Theorem (4.9) come from matrices we will
denote $M_n$ and $M_n'$. Thus the formula of the theorem becomes

$$\mathrm{Res}(F_0, \dots, F_n) = \pm \frac{\det(M_n)}{\det(M_n')}$$

provided $\det(M_n') \ne 0$. When $\det(M_n') = 0$, Canny’s method is to
introduce a new variable $u$ and consider the resultant

$$\mathrm{Res}(F_0 - u\, x_0^{d_0}, \dots, F_n - u\, x_n^{d_n}). \tag{4.12}$$



Exercise 17. Fix an ordering of the monomials of total degree $d$. Since
each equation in (4.1) corresponds to such a monomial, we can order the
equations in the same way. The ordering of the monomials and equations
determines the matrices $M_n$ and $M_n'$. Then consider the new system of
equations we get by replacing $F_i$ by $F_i - u\, x_i^{d_i}$ in (4.1) for $0 \le i \le n$.

a. Show that the matrix of the new system of equations is $M_n - uI$, where
$I$ is the identity matrix of the same size as $M_n$.

b. Show that the matrix we get by deleting all rows and columns
corresponding to reduced monomials is $M_n' - uI$,
where $I$ is the appropriate identity matrix.



This exercise shows that the resultant (4.12) is given by

$$\mathrm{Res}(F_0 - u\, x_0^{d_0}, \dots, F_n - u\, x_n^{d_n}) = \pm \frac{\det(M_n - uI)}{\det(M_n' - uI)}$$

since $\det(M_n' - uI) \ne 0$ (it is the characteristic polynomial of $M_n'$). It
follows that the resultant $\mathrm{Res}(F_0, \dots, F_n)$ is the constant term of the
polynomial obtained by dividing $\det(M_n - uI)$ by $\det(M_n' - uI)$. In fact, as
the following exercise shows, we can find the constant term directly from
these polynomials:



Exercise 18. Let $F$ and $G$ be polynomials in $u$ such that $F$ is a multiple
of $G$. Let $G = b_r u^r + {}$higher order terms, where $b_r \ne 0$. Then $F = a_r u^r + {}$
higher order terms. Prove that the constant term of $F/G$ is $a_r/b_r$.

It follows that the problem of finding the resultant is reduced to computing
the determinants $\det(M_n - uI)$ and $\det(M_n' - uI)$. These are called
generalized characteristic polynomials in [Can1].
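
The observation of Exercise 18 is easy to try out on small polynomials. In the sketch below (hypothetical helper names; polynomials in $u$ are stored as coefficient lists, lowest degree first), $G$ plays the role of $\det(M_n' - uI)$ with vanishing constant term and $F$ is a multiple of $G$ built by hand, so no actual resultant matrices are involved.

```python
from fractions import Fraction


def poly_mul(P, Q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(P) + len(Q) - 1)
    for i, p in enumerate(P):
        for j, q in enumerate(Q):
            out[i + j] += p * q
    return out


def lowest_term(poly):
    """Return (r, coefficient) for the lowest-order nonzero term of poly."""
    for r, c in enumerate(poly):
        if c != 0:
            return r, c
    raise ValueError("zero polynomial has no lowest term")


def constant_term_of_quotient(F, G):
    """Constant term of F/G as a_r / b_r, assuming F is a multiple of G."""
    r, b_r = lowest_term(G)
    return Fraction(F[r], b_r)


G = [0, 2, 1]            # G = 2u + u^2, so r = 1 and b_r = 2
quotient = [3, 1]        # the (normally unknown) quotient 3 + u
F = poly_mul(G, quotient)
print(constant_term_of_quotient(F, G))
```

Only the lowest-order coefficients of the two polynomials are needed, so the full division never has to be carried out.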

This brings us to the second part of our discussion, the computation
of symbolic determinants. The methods described above for the numerical
case don’t apply here, so something new is needed. One of the most interesting
methods involves interpolation, as described in [CM]. The basic idea is
that one can reconstruct a polynomial from its values at a sufficiently large
number of points. More precisely, suppose we have a symbolic determinant,
say involving variables $u_0, \dots, u_n$. The determinant is then a polynomial
$D(u_0, \dots, u_n)$. Substituting $u_i = a_i$, where $a_i \in \mathbb{Z}$ for $0 \le i \le n$,
we get a numerical determinant, which we can evaluate using the above
method. Then, once we determine $D(a_0, \dots, a_n)$ for sufficiently many
points $(a_0, \dots, a_n)$, we can reconstruct $D(u_0, \dots, u_n)$. Roughly speaking,
the number of points chosen depends on the degree of $D$ in the variables
$u_0, \dots, u_n$. There are several methods for choosing points $(a_0, \dots, a_n)$,
leading to various interpolation schemes (Vandermonde, dense, sparse,
probabilistic) which are discussed in [CM]. We should also mention that
in the case of a single variable, there is a method of Manocha [Man2] for
finding the determinant without interpolation.
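
In the simplest case of a single variable $u$, the interpolation idea can be sketched as follows. This is plain Lagrange interpolation over the rationals, used here only as an illustration and not one of the specific schemes of [CM]; the $2 \times 2$ matrix and all function names are hypothetical, and the degree bound (here 2, so three sample points) is assumed known in advance.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def times_linear(poly, root):
    """Multiply poly (coefficients, lowest degree first) by (u - root)."""
    out = [Fraction(0)] * (len(poly) + 1)
    for k, c in enumerate(poly):
        out[k + 1] += c
        out[k] -= root * c
    return out


def interpolate(points, values):
    """Lagrange interpolation; returns coefficients, lowest degree first."""
    n = len(points)
    coeffs = [Fraction(0)] * n
    for i in range(n):
        basis, denom = [Fraction(1)], Fraction(1)
        for j in range(n):
            if j != i:
                basis = times_linear(basis, points[j])
                denom *= points[i] - points[j]
        for k, c in enumerate(basis):
            coeffs[k] += values[i] * c / denom
    return coeffs


def symbolic_matrix(u):
    """A small matrix whose entries depend on u; its determinant has degree 2."""
    return [[u, 1], [2, u + 3]]


points = [0, 1, 2]
values = [det(symbolic_matrix(a)) for a in points]
print(interpolate(points, values))
```

Each sample requires only a numerical determinant, so the modular method discussed earlier could be used for the evaluations in place of the exact rational elimination shown here.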

Now that we know how to compute resultants, it’s time to put them to 
work. In the next section, we will explain how resultants can be used to 
solve systems of polynomial equations. We should also mention that a more 
general notion of resultant, called the sparse resultant, will be discussed in 
Chapter 7. 



Additional Exercises for §4 



Exercise 19. Show that the number of monomials of total degree $d$ in
$n + 1$ variables is the binomial coefficient $\binom{d+n}{n}$.

Exercise 20. This exercise is concerned with the proof of Proposition (4.6).

a. Suppose that $E \in \mathbb{Z}[u_{i,\alpha}]$ is irreducible and nonconstant. If $F \in \mathbb{Q}[u_{i,\alpha}]$
is such that $D = EF \in \mathbb{Z}[u_{i,\alpha}]$, then prove that $F \in \mathbb{Z}[u_{i,\alpha}]$. Hint:
We can find a positive integer $m$ such that $mF \in \mathbb{Z}[u_{i,\alpha}]$. Then apply
unique factorization to $m \cdot D = E \cdot mF$.

b. Let $D = EF$ in $\mathbb{Z}[u_{i,\alpha}]$, and assume that for some $j$, $D$ is homogeneous
in the $u_{j,\alpha}$, $|\alpha| = d_j$. Then prove that $E$ and $F$ are also
homogeneous in the $u_{j,\alpha}$, $|\alpha| = d_j$.



Exercise 21. In this exercise and the next we will prove the formula for
$\mathrm{Res}_{2,2,2}$ given in equation (2.8). Here we prove two facts we will need.

a. Prove Euler’s formula, which states that if $F \in k[x_0, \dots, x_n]$ is
homogeneous of total degree $d$, then

$$d \cdot F = \sum_{i=0}^{n} x_i \frac{\partial F}{\partial x_i}.$$

Hint: First prove it for a monomial of total degree $d$ and then use
linearity.







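b. Suppose that

$$M = \det \begin{pmatrix} A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \\ C_1 & C_2 & C_3 \end{pmatrix},$$

where $A_1, \dots, C_3$ are in $k[x_0, \dots, x_n]$. Then prove that

$$\frac{\partial M}{\partial x_i} =
\det \begin{pmatrix} \partial A_1/\partial x_i & A_2 & A_3 \\ \partial B_1/\partial x_i & B_2 & B_3 \\ \partial C_1/\partial x_i & C_2 & C_3 \end{pmatrix}
+ \det \begin{pmatrix} A_1 & \partial A_2/\partial x_i & A_3 \\ B_1 & \partial B_2/\partial x_i & B_3 \\ C_1 & \partial C_2/\partial x_i & C_3 \end{pmatrix}
+ \det \begin{pmatrix} A_1 & A_2 & \partial A_3/\partial x_i \\ B_1 & B_2 & \partial B_3/\partial x_i \\ C_1 & C_2 & \partial C_3/\partial x_i \end{pmatrix}.$$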

Exercise 22. We can now prove formula (2.8) for $\mathrm{Res}_{2,2,2}$. Fix $F_0, F_1, F_2 \in \mathbb{C}[x, y, z]$
of total degree 2. As in §2, let $J$ be the Jacobian determinant

$$J = \det \begin{pmatrix}
\partial F_0/\partial x & \partial F_0/\partial y & \partial F_0/\partial z \\
\partial F_1/\partial x & \partial F_1/\partial y & \partial F_1/\partial z \\
\partial F_2/\partial x & \partial F_2/\partial y & \partial F_2/\partial z
\end{pmatrix}.$$

a. Prove that $J$ vanishes at every nontrivial solution of $F_0 = F_1 = F_2 = 0$.
Hint: Apply Euler’s formula (part a of Exercise 21) to $F_0, F_1, F_2$.

b. Show that

$$x \cdot J = 2 \det \begin{pmatrix}
F_0 & \partial F_0/\partial y & \partial F_0/\partial z \\
F_1 & \partial F_1/\partial y & \partial F_1/\partial z \\
F_2 & \partial F_2/\partial y & \partial F_2/\partial z
\end{pmatrix},$$

and derive similar formulas for $y \cdot J$ and $z \cdot J$. Hint: Use column
operations and Euler’s formula.

c. By differentiating the formulas from part b for $x \cdot J$, $y \cdot J$ and $z \cdot J$
with respect to $x, y, z$, show that the partial derivatives of $J$ vanish at
all nontrivial solutions of $F_0 = F_1 = F_2 = 0$. Hint: Part b of Exercise 21
and part a of this exercise will be useful.

d. Use part c to show that the determinant in (2.8) vanishes at all nontrivial
solutions of $F_0 = F_1 = F_2 = 0$.

e. Now prove (2.8). Hint: The proof is similar to what we did in parts b-f
of Exercise 15.

Exercise 23. This exercise will give more details needed in the proof of
Proposition (4.6). We will use the same terminology as in the proof. Let
the weight of the variable $u_{i,\alpha}$ be $w(u_{i,\alpha})$.

a. Prove that a polynomial $P(u_{i,\alpha})$ is isobaric of weight $m$ if and only if
$P(\lambda^{w(u_{i,\alpha})} u_{i,\alpha}) = \lambda^m P(u_{i,\alpha})$ for all nonzero $\lambda \in \mathbb{C}$.

b. Prove that if $P = QR$ is isobaric, then so are $Q$ and $R$. Also show that
the weight of $P$ is the sum of the weights of $Q$ and $R$. Hint: Use part a.

c. Prove that $D_n$ is isobaric of weight $d_0 \cdots d_n$. Hint: Assign the variables
$x_0, \dots, x_{n-1}, x_n$ respective weights $0, \dots, 0, 1$. Let $x^\gamma$ be a monomial
with $|\gamma| = d$ (which indexes a column of $D_n$), and let $\alpha \in S_i$ (which
indexes a row in $D_n$). If the corresponding entry in $D_n$ is $c_{\gamma,\alpha,i}$, then
show that

$$w(c_{\gamma,\alpha,i}) = w(x^\gamma) - w(x^\alpha / x_i^{d_i})
= w(x^\gamma) - w(x^\alpha) + \begin{cases} 0 & i < n \\ d_n & i = n. \end{cases}$$

Note that $x^\gamma$ and $x^\alpha$ range over all monomials of total degree $d$.

d. Use Theorems (2.3) and (3.1) to prove that if $u_i$ represents the coefficient
of $x_i^{d_i}$ in $F_i$, then $\pm u_0^{d_1 \cdots d_n} \cdots u_n^{d_0 \cdots d_{n-1}}$ is in Res.

§5 Solving Equations Via Resultants 

In this section, we will show how resultants can be used to solve polynomial
systems. To start, suppose we have $n$ homogeneous polynomials $F_1, \dots, F_n$
of degree $d_1, \dots, d_n$ in variables $x_0, \dots, x_n$. We want to find the nontrivial
solutions of the system of equations

$$F_1 = \cdots = F_n = 0. \tag{5.1}$$

But before we begin our discussion of finding solutions, we first need to
review Bezout’s Theorem and introduce the important idea of genericity.

As we saw in §3, Bezout’s Theorem tells us that when (5.1) has finitely
many solutions in $\mathbb{P}^n$, the number of solutions is $d_1 \cdots d_n$, counting
multiplicities. In practice, it is often convenient to find solutions in affine space.
In §3, we dehomogenized by setting $x_n = 1$, but in order to be compatible
with Chapter 7, we now dehomogenize using $x_0 = 1$. Hence, we define:

$$\begin{aligned}
f_i(x_1, \dots, x_n) &= F_i(1, x_1, \dots, x_n) \\
\bar{F}_i(x_1, \dots, x_n) &= F_i(0, x_1, \dots, x_n).
\end{aligned} \tag{5.2}$$

Note that $f_i$ has total degree at most $d_i$. Inside $\mathbb{P}^n$, we have the affine space
$\mathbb{C}^n \subset \mathbb{P}^n$ defined by $x_0 = 1$, and the solutions of the affine equations

$$f_1 = \cdots = f_n = 0 \tag{5.3}$$

are precisely the solutions of (5.1) which lie in $\mathbb{C}^n \subset \mathbb{P}^n$. Similarly, the
nontrivial solutions of the homogeneous equations

$$\bar{F}_1 = \cdots = \bar{F}_n = 0$$

may be regarded as the solutions which lie “at $\infty$”. We say that (5.3) has
no solutions at $\infty$ if $\bar{F}_1 = \cdots = \bar{F}_n = 0$ has no nontrivial solutions. By
Theorem (2.3), this is equivalent to the condition

$$\mathrm{Res}_{d_1, \dots, d_n}(\bar{F}_1, \dots, \bar{F}_n) \ne 0. \tag{5.4}$$

The proof of Theorem (3.4) implies the following version of Bezout’s
Theorem.

(5.5) Theorem (Bezout’s Theorem). Assume that $f_1, \dots, f_n$ are defined
as in (5.2) and that the affine equations (5.3) have no solutions at $\infty$.
Then these equations have $d_1 \cdots d_n$ solutions (counted with multiplicity),
and the ring

$$A = \mathbb{C}[x_1, \dots, x_n]/\langle f_1, \dots, f_n \rangle$$

has dimension $d_1 \cdots d_n$ as a vector space over $\mathbb{C}$.

Note that this result does not hold for all systems of equations (5.3). In
general, we need a language which allows us to talk about properties which
are true for most but not necessarily all polynomials $f_1, \dots, f_n$. This brings
us to the idea of genericity.

(5.6) Definition. A property is said to hold generically for polynomials
$f_1, \dots, f_n$ of degree at most $d_1, \dots, d_n$ if there is a nonzero polynomial in
the coefficients of the $f_i$ such that the property holds for all $f_1, \dots, f_n$ for
which the polynomial is nonvanishing.

Intuitively, a property of polynomials is generic if it holds for “most”
polynomials $f_1, \dots, f_n$. Our definition makes this precise by defining
“most” to mean that some polynomial in the coefficients of the $f_i$ is
nonvanishing. As a simple example, consider a single polynomial $ax^2 + bx + c$.
We claim that the property “$ax^2 + bx + c = 0$ has two solutions, counting
multiplicity” holds generically. To prove this, we must find a polynomial
in the coefficients $a, b, c$ whose nonvanishing implies the desired property.
Here, the condition is easily seen to be $a \ne 0$ since we are working over the
complex numbers.

Exercise 1. Show that the property “$ax^2 + bx + c = 0$ has two distinct
solutions” is generic. Hint: By the quadratic formula, $a(b^2 - 4ac) \ne 0$
implies the desired property.

A more relevant example is given by Theorem (5.5). Having no solutions
at $\infty$ is equivalent to the nonvanishing of the resultant (5.4), and since
$\mathrm{Res}_{d_1, \dots, d_n}(\bar{F}_1, \dots, \bar{F}_n)$ is a nonzero polynomial in the coefficients of the
$f_i$, it follows that this version of Bezout’s Theorem holds generically. Thus,
for most choices of the coefficients, the equations $f_1 = \cdots = f_n = 0$
have $d_1 \cdots d_n$ solutions, counting multiplicity. In particular, if we choose
polynomials $f_1, \dots, f_n$ with random coefficients (say given by some random
number generator), then, with a very high probability, Bezout’s Theorem
will hold for the corresponding system of equations.

In general, genericity comes in different “flavors”. For instance, consider
solutions of the equation $ax^2 + bx + c = 0$:

• Generically, $ax^2 + bx + c = 0$ has two solutions, counting multiplicity.
This happens when $a \ne 0$.

• Generically, $ax^2 + bx + c = 0$ has two distinct solutions. By Exercise 1,
this happens when $a(b^2 - 4ac) \ne 0$.

Similarly, there are different versions of Bezout’s Theorem. In particular,
one can strengthen Theorem (5.5) to prove that generically, the equations
$f_1 = \cdots = f_n = 0$ have $d_1 \cdots d_n$ distinct solutions. This means that
generically, (5.3) has no solutions at $\infty$ and all solutions have multiplicity
one. A proof of this result will be sketched in Exercise 6 at the end of the
section.

With this genericity assumption on $f_1, \dots, f_n$, we know the number of
distinct solutions of (5.3), and our next task is to find them. We could
use the methods of Chapter 2, but it is also possible to find the solutions
using resultants. This section will describe two closely related methods,
u-resultants and hidden variables, for solving equations. The next section
will discuss further methods which use eigenvalues and eigenvectors.



The u-Resultant 

The basic idea of van der Waerden’s u-resultant (see [vdW]) is to start with
the homogeneous equations $F_1 = \cdots = F_n = 0$ of (5.1) and add another
equation $F_0 = 0$ to (5.1), so that we have $n + 1$ homogeneous equations in
$n + 1$ variables. We will use

$$F_0 = u_0 x_0 + \cdots + u_n x_n,$$

where $u_0, \dots, u_n$ are independent variables. Because the number of
equations equals the number of variables, we can form the resultant

$$\mathrm{Res}_{1, d_1, \dots, d_n}(F_0, F_1, \dots, F_n),$$

which is called the u-resultant. Note that the u-resultant is a polynomial
in $u_0, \dots, u_n$.

As already mentioned, we will sometimes work in the affine situation,
where we dehomogenize $F_0, \dots, F_n$ to obtain $f_0, \dots, f_n$. This is the
notation of (5.2), and in particular, observe that

$$f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n. \tag{5.7}$$

Because $f_0, \dots, f_n$ and $F_0, \dots, F_n$ have the same coefficients, we write the
u-resultant as $\mathrm{Res}(f_0, \dots, f_n)$ instead of $\mathrm{Res}(F_0, \dots, F_n)$ in this case.

Before we work out the general theory of the u-resultant, let’s do an
example. The following exercise will seem like a lot of work at first, but its
surprising result will be worth the effort.

Exercise 2. Let

$$\begin{aligned}
F_1 &= x_1^2 + x_2^2 - 10 x_0^2 = 0 \\
F_2 &= x_1^2 + x_1 x_2 + 2 x_2^2 - 16 x_0^2 = 0
\end{aligned}$$

be the intersection of a circle and an ellipse in $\mathbb{P}^2$. By Bezout’s Theorem,
there are four solutions. To find the solutions, we add the equation

$$F_0 = u_0 x_0 + u_1 x_1 + u_2 x_2 = 0.$$

a. The theory of §4 computes the resultant using $10 \times 10$ determinants $D_0$,
$D_1$ and $D_2$. Using $D_0$, Theorem (4.9) implies

$$\mathrm{Res}_{1,2,2}(F_0, F_1, F_2) = \pm \frac{D_0}{D_0'}.$$

If the variables are ordered $x_2, x_1, x_0$, show that $D_0 = \det(M_0)$, where
$M_0$ is the matrix

$$M_0 = \begin{pmatrix}
u_0 & u_1 & u_2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & u_0 & 0 & u_2 & u_1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & u_0 & u_1 & 0 & u_2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & u_0 & 0 & 0 & 0 & u_1 & u_2 & 0 \\
-10 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & -10 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & -10 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
-16 & 0 & 0 & 1 & 1 & 2 & 0 & 0 & 0 & 0 \\
0 & -16 & 0 & 0 & 0 & 0 & 1 & 1 & 2 & 0 \\
0 & 0 & -16 & 0 & 0 & 0 & 0 & 1 & 1 & 2
\end{pmatrix}.$$

Also show that $D_0' = \det(M_0')$, where $M_0'$ is given by

$$M_0' = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}.$$

Hint: Using the order $x_2, x_1, x_0$ gives $S_0 = \{x_0^3, x_0^2 x_1, x_0^2 x_2, x_0 x_1 x_2\}$,
$S_1 = \{x_0 x_1^2, x_1^3, x_1^2 x_2\}$ and $S_2 = \{x_0 x_2^2, x_1 x_2^2, x_2^3\}$. The columns in $M_0$
correspond to the monomials $x_0^3$, $x_0^2 x_1$, $x_0^2 x_2$, $x_0 x_1 x_2$, $x_0 x_1^2$, $x_0 x_2^2$, $x_1^3$,
$x_1^2 x_2$, $x_1 x_2^2$, $x_2^3$. Exercise 13 of §4 will be useful.

b. Conclude that

$$\begin{aligned}
\mathrm{Res}_{1,2,2}(F_0, F_1, F_2) = \pm \bigl( & 2u_0^4 + 16u_1^4 + 36u_2^4 - 80u_1^3 u_2 + 120 u_1 u_2^3 \\
& - 18u_0^2 u_1^2 - 22 u_0^2 u_2^2 + 52 u_1^2 u_2^2 - 4 u_0^2 u_1 u_2 \bigr).
\end{aligned}$$

c. Using a computer to factor this, show that $\mathrm{Res}_{1,2,2}(F_0, F_1, F_2)$ equals

$$(u_0 + u_1 - 3u_2)(u_0 - u_1 + 3u_2)(u_0^2 - 8u_1^2 - 2u_2^2 - 8u_1 u_2)$$

up to a constant. By writing the quadratic factor as $u_0^2 - 2(2u_1 + u_2)^2$,
conclude that $\mathrm{Res}_{1,2,2}(F_0, F_1, F_2)$ equals

$$(u_0 + u_1 - 3u_2)(u_0 - u_1 + 3u_2)(u_0 + 2\sqrt{2}\,u_1 + \sqrt{2}\,u_2)(u_0 - 2\sqrt{2}\,u_1 - \sqrt{2}\,u_2)$$

times a nonzero constant. Hint: If you are using Maple, let the resultant
be res and use the command factor(res). Also, the command
factor(res,RootOf(x^2-2)) will do the complete factorization.

d. The coefficients of the linear factors of $\mathrm{Res}_{1,2,2}(F_0, F_1, F_2)$ give four
points

$$(1, 1, -3),\quad (1, -1, 3),\quad (1, 2\sqrt{2}, \sqrt{2}),\quad (1, -2\sqrt{2}, -\sqrt{2})$$

in $\mathbb{P}^2$. Show that these points are the four solutions of the equations
$F_1 = F_2 = 0$. Thus the solutions in $\mathbb{P}^2$ are precisely the coefficients of
the linear factors of $\mathrm{Res}_{1,2,2}(F_0, F_1, F_2)$!
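
The claims of Exercise 2 can be spot-checked numerically without any symbolic software. The sketch below (hypothetical function names) evaluates $\det(M_0)$ at integer values of $u_0, u_1, u_2$ by exact rational elimination and compares it, up to sign, with $2(u_0 + u_1 - 3u_2)(u_0 - u_1 + 3u_2)(u_0^2 - 2(2u_1 + u_2)^2)$, which is the factorization of part c with the constant 2 read off from the leading term in part b. Agreement at a handful of points is a sanity check, not a proof.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n, result = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def macaulay_matrix(u0, u1, u2):
    """The 10 x 10 matrix M_0 of Exercise 2, evaluated at numbers u0, u1, u2."""
    return [
        [u0, u1, u2, 0, 0, 0, 0, 0, 0, 0],
        [0, u0, 0, u2, u1, 0, 0, 0, 0, 0],
        [0, 0, u0, u1, 0, u2, 0, 0, 0, 0],
        [0, 0, 0, u0, 0, 0, 0, u1, u2, 0],
        [-10, 0, 0, 0, 1, 1, 0, 0, 0, 0],
        [0, -10, 0, 0, 0, 0, 1, 0, 1, 0],
        [0, 0, -10, 0, 0, 0, 0, 1, 0, 1],
        [-16, 0, 0, 1, 1, 2, 0, 0, 0, 0],
        [0, -16, 0, 0, 0, 0, 1, 1, 2, 0],
        [0, 0, -16, 0, 0, 0, 0, 1, 1, 2],
    ]


def factored_form(u0, u1, u2):
    """Product of the rational factors from part c, times the constant 2."""
    return (2 * (u0 + u1 - 3 * u2) * (u0 - u1 + 3 * u2)
            * (u0 ** 2 - 2 * (2 * u1 + u2) ** 2))


for u in [(1, 0, 0), (1, 2, 3), (-2, 5, 7), (2, 1, 1)]:
    print(u, det(macaulay_matrix(*u)), factored_form(*u))
```

The last sample point $(2, 1, 1)$ satisfies $u_0 + u_1 - 3u_2 = 0$, so the line $F_0 = 0$ passes through the solution $(1, 1, -3)$ and the determinant is expected to vanish there.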

In this exercise, all of the solutions lay in the affine space $\mathbb{C}^2 \subset \mathbb{P}^2$
defined by $x_0 = 1$. In general, we will study the u-resultant from the affine
point of view. The key fact is that when all of the multiplicities are one,
the solutions of (5.3) can be found using $\mathrm{Res}_{1, d_1, \dots, d_n}(f_0, \dots, f_n)$.

(5.8) Proposition. Assume that $f_1 = \cdots = f_n = 0$ have total degrees
bounded by $d_1, \dots, d_n$, no solutions at $\infty$, and all solutions of multiplicity
one. If $f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n$, where $u_0, \dots, u_n$ are independent
variables, then there is a nonzero constant $C$ such that

$$\mathrm{Res}_{1, d_1, \dots, d_n}(f_0, \dots, f_n) = C \prod_{p \in \mathbf{V}(f_1, \dots, f_n)} f_0(p).$$

Proof. Let $C = \mathrm{Res}_{d_1, \dots, d_n}(\bar{F}_1, \dots, \bar{F}_n)$, which is nonzero by hypothesis.
Since the coefficients of $f_0$ are the variables $u_0, \dots, u_n$, we need to work over
the field $K = \mathbb{C}(u_0, \dots, u_n)$ of rational functions in $u_0, \dots, u_n$. Hence, in
this proof, we will work over $K$ rather than over $\mathbb{C}$. Fortunately, the results
we need are true over $K$, even though we proved them only over $\mathbb{C}$.
Adapting Theorem (3.4) to the situation of (5.2) (see Exercise 8) yields

$$\mathrm{Res}_{1, d_1, \dots, d_n}(f_0, \dots, f_n) = C \det(m_{f_0}),$$

where $m_{f_0} : A \to A$ is the linear map given by multiplication by $f_0$ on the
quotient ring

$$A = K[x_1, \dots, x_n]/\langle f_1, \dots, f_n \rangle.$$

By Theorem (5.5), $A$ is a vector space over $K$ of dimension $d_1 \cdots d_n$, and
Theorem (4.5) of Chapter 2 implies that the eigenvalues of $m_{f_0}$ are the
values $f_0(p)$ for $p \in \mathbf{V}(f_1, \dots, f_n)$. Since all multiplicities are one, there
are $d_1 \cdots d_n$ such points $p$, and the corresponding values $f_0(p)$ are distinct
since $f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n$ and $u_0, \dots, u_n$ are independent variables.
Thus $m_{f_0}$ has $d_1 \cdots d_n$ distinct eigenvalues $f_0(p)$, so that

$$\det(m_{f_0}) = \prod_{p \in \mathbf{V}(f_1, \dots, f_n)} f_0(p).$$

This proves the proposition. □

To see more clearly what the proposition says, let the points of
$\mathbf{V}(f_1, \dots, f_n)$ be $p_i$ for $1 \le i \le d_1 \cdots d_n$. If we write each point as
$p_i = (a_{i1}, \dots, a_{in}) \in \mathbb{C}^n$, then (5.7) implies

$$f_0(p_i) = u_0 + a_{i1} u_1 + \cdots + a_{in} u_n,$$

so that by Proposition (5.8), the u-resultant is given by

$$\mathrm{Res}_{1, d_1, \dots, d_n}(f_0, \dots, f_n) = C \prod_{i=1}^{d_1 \cdots d_n} (u_0 + a_{i1} u_1 + \cdots + a_{in} u_n). \tag{5.9}$$

We see clearly that the u-resultant is a polynomial in $u_0, \dots, u_n$. Furthermore,
we get the following method for finding solutions of (5.3): compute
$\mathrm{Res}_{1, d_1, \dots, d_n}(f_0, \dots, f_n)$, factor it into linear factors, and then read off the
solutions! Hence, once we have the u-resultant, solving (5.3) is reduced to
a problem in multivariable factorization.

To compute the u-resultant, we use Theorem (4.9). Because of our
emphasis on $f_0$, we represent the resultant as the quotient

$$\mathrm{Res}_{1, d_1, \dots, d_n}(f_0, \dots, f_n) = \pm \frac{D_0}{D_0'}. \tag{5.10}$$

This is the formula we used in Exercise 2. In §4, we got the determinant $D_0$
by working with the homogenizations $F_i$ of the $f_i$, regarding $x_0$ as the last
variable, and decomposing monomials of degree $d = 1 + d_1 + \cdots + d_n - n$
into disjoint subsets $S_0, \dots, S_n$. Taking $x_0$ last means that $S_0$ consists of
the $d_1 \cdots d_n$ monomials

$$S_0 = \{x_0^{a_0} x_1^{a_1} \cdots x_n^{a_n} : 0 \le a_i \le d_i - 1 \text{ for } i > 0,\ \textstyle\sum_{i=0}^{n} a_i = d\}. \tag{5.11}$$

Then $D_0$ is the determinant of the matrix $M_0$ representing the system of
equations (4.1). We saw an example of this in Exercise 2.

The following exercise simplifies the task of computing u-resultants.

Exercise 3. Assuming that $D_0' \ne 0$ in (5.10), prove that $D_0'$ does not
involve $u_0, \dots, u_n$ and conclude that $\mathrm{Res}_{1, d_1, \dots, d_n}(f_0, \dots, f_n)$ and $D_0$ differ
by a constant factor when regarded as polynomials in $\mathbb{C}[u_0, \dots, u_n]$.

We will write $D_0$ as $D_0(u_0, \dots, u_n)$ to emphasize the dependence on
$u_0, \dots, u_n$. We can use $D_0(u_0, \dots, u_n)$ only when $D_0' \ne 0$, but since $D_0'$ is
a polynomial in the coefficients of the $f_i$, Exercise 3 means that generically,
the linear factors of the determinant $D_0(u_0, \dots, u_n)$ give the solutions of
our equations (5.3). In this situation, we will apply the term u-resultant to
both $\mathrm{Res}_{1, d_1, \dots, d_n}(f_0, \dots, f_n)$ and $D_0(u_0, \dots, u_n)$.

Unfortunately, the u-resultant has some serious limitations. First, it is
not easy to compute symbolic determinants of large size (see the discussion
at the end of §4). And even if we can find the determinant, multivariable
factorization as in (5.9) is very hard, especially since in most cases, floating
point numbers will be involved.

There are several methods for dealing with this situation. We will
describe one, as presented in [CM]. The basic idea is to specialize some of the
coefficients in $f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n$. For example, the argument of
Proposition (5.8) shows that when the $x_n$-coordinates of the solution points
are distinct, the specialization $u_1 = \cdots = u_{n-1} = 0$, $u_n = -1$ transforms
(5.9) into the formula

$$\mathrm{Res}_{1, d_1, \dots, d_n}(u_0 - x_n, f_1, \dots, f_n) = C \prod_{i=1}^{d_1 \cdots d_n} (u_0 - a_{in}), \tag{5.12}$$

where $a_{in}$ is the $x_n$-coordinate of $p_i = (a_{i1}, \dots, a_{in}) \in \mathbf{V}(f_1, \dots, f_n)$. This
resultant is a univariate polynomial in $u_0$ whose roots are precisely the
$x_n$-coordinates of solutions of (5.3). There are similar formulas for the other
coordinates of the solutions.

If we use the numerator $D_0(u_0, \dots, u_n)$ of (5.10) as the u-resultant, then
setting $u_1 = \cdots = u_{n-1} = 0$, $u_n = -1$ gives $D_0(u_0, 0, \dots, 0, -1)$, which
is a polynomial in $u_0$. The argument of Exercise 3 shows that generically,
$D_0(u_0, 0, \dots, 0, -1)$ is a constant multiple of $\mathrm{Res}(u_0 - x_n, f_1, \dots, f_n)$, so that
its roots are also the $x_n$-coordinates. Since $D_0(u_0, 0, \dots, 0, -1)$ is given by
a symbolic determinant depending on the single variable $u_0$, it is much
easier to compute than in the multivariate case. Using standard techniques
(discussed in Chapter 2) for finding the roots of univariate polynomials
such as $D_0(u_0, 0, \dots, 0, -1)$, we get a computationally efficient method for
finding the $x_n$-coordinates of our solutions. Similarly, we can find the other
coordinates of the solutions by this method.

Exercise 4. Let $D_0(u_0, u_1, u_2)$ be the determinant in Exercise 2.

a. Compute $D_0(u_0, -1, 0)$ and $D_0(u_0, 0, -1)$.

b. Find the roots of these polynomials numerically. Hint: Try the Maple
command fsolve. In general, fsolve should be used with the complex
option, though in this case it’s not necessary since the roots are real.

c. What does this say about the coordinates of the solutions of the
equations $x_1^2 + x_2^2 = 10$, $x_1^2 + x_1 x_2 + 2x_2^2 = 16$? Can you figure out what
the solutions are?

As this exercise illustrates, the univariate polynomials we get from the 
u-resultant enable us to find the individual coordinates of the solutions, 
but they don’t tell us how to match them up. One method for doing this 
(based on [CM]) will be explained in Exercise 7 at the end of the section. 
We should also mention that a different u-resultant method for computing 
solutions is given in [Can2]. 

All of the $u$-resultant methods make strong genericity assumptions on
the polynomials $f_0, \ldots, f_n$. In practice, one doesn't know in advance if a
given system of equations is generic. Here are some of the things that can go
wrong when trying to apply the above methods to non-generic equations:

• There might be solutions at infinity. This problem can be avoided by
making a generic linear change of coordinates.

• If too many coefficients are zero, it might be necessary to use the sparse 
resultants of Chapter 7. 

• The equations (5.1) might have infinitely many solutions. In the language
of algebraic geometry, the projective variety $\mathbf{V}(F_1, \ldots, F_n)$ might have
components of positive dimension, together with some isolated solutions.
One is still interested in the isolated solutions, and techniques for finding
them are described in Section 4 of [Can1].

• The denominator $D_0'$ in the resultant formula (5.10) might vanish. When
this happens, one can use the generalized characteristic polynomials
described in §4 to avoid this difficulty. See Section 4.1 of [CM] for details.

• Distinct solutions might have the same $x_i$-coordinate for some $i$. The
polynomial giving the $x_i$-coordinates would have multiple roots, which
are computationally unstable. This problem can be avoided with a
generic change of coordinates. See Section 4.2 of [CM] for an example.

Also, Chapter 4 will give versions of (5.12) and Proposition (5.8) for the
case when $f_1 = \cdots = f_n = 0$ has solutions of multiplicity $> 1$.



Hidden Variables 

One of the better known resultant techniques for solving equations is the
hidden variable method. The basic idea is to regard one of the variables as a
constant and then take a resultant. To illustrate how this works, consider
the affine equations we get from Exercise 2 by setting $x_0 = 1$:

(5.13)
$$f_1 = x_1^2 + x_2^2 - 10 = 0$$
$$f_2 = x_1^2 + x_1 x_2 + 2x_2^2 - 16 = 0.$$

If we regard $x_2$ as a constant, we can use the resultant of §1 to obtain

$$\mathrm{Res}(f_1, f_2) = 2x_2^4 - 22x_2^2 + 36 = 2(x_2 - 3)(x_2 + 3)(x_2 - \sqrt{2})(x_2 + \sqrt{2}).$$

The resultant is a polynomial in $x_2$, and its roots are precisely the
$x_2$-coordinates of the solutions of the equations (as we found in Exercise 2).
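
The computation above can be checked with a computer algebra system. The following is a minimal sketch using SymPy rather than the Maple commands mentioned in the text; the variable names `hidden` and `x2_coords` are ours. It treats $x_2$ as a constant by taking the Sylvester resultant with respect to $x_1$.

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2")

# The affine equations f1 = f2 = 0 from the text.
f1 = x1**2 + x2**2 - 10
f2 = x1**2 + x1*x2 + 2*x2**2 - 16

# Hide x2: regard it as a constant and eliminate x1.
hidden = sp.factor(sp.resultant(f1, f2, x1))

# The roots of the hidden variable resultant are the x2-coordinates.
x2_coords = sp.solve(hidden, x2)

print(hidden)
print(x2_coords)
```

As the text warns, this gives only the individual $x_2$-coordinates; it does not say which $x_1$-coordinate goes with which.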

To generalize this example, we first review the affine form of the resultant.
Given $n + 1$ homogeneous polynomials $G_0, \ldots, G_n$ of degrees $d_0, \ldots, d_n$ in
$n + 1$ variables $x_0, \ldots, x_n$, we get $\mathrm{Res}_{d_0,\ldots,d_n}(G_0, \ldots, G_n)$. Setting $x_0 = 1$
gives

$$g_i(x_1, \ldots, x_n) = G_i(1, x_1, \ldots, x_n),$$

and since the $g_i$ and $G_i$ have the same coefficients, we can write the
resultant as $\mathrm{Res}_{d_0,\ldots,d_n}(g_0, \ldots, g_n)$. Thus, $n + 1$ polynomials $g_0, \ldots, g_n$ in $n$
variables $x_1, \ldots, x_n$ have a resultant. It follows that from the affine point
of view, forming a resultant requires that the number of polynomials be one
more than the number of variables.

Now, suppose we have $n$ polynomials $f_1, \ldots, f_n$ of degrees $d_1, \ldots, d_n$ in
$n$ variables $x_1, \ldots, x_n$. In terms of resultants, we have the wrong numbers
of equations and variables. One solution is to add a new polynomial, which
leads to the $u$-resultant. Here, we will pursue the other alternative, which
is to get rid of one of the variables. The basic idea is what we did above:
we hide a variable, say $x_n$, by regarding it as a constant. This gives $n$
polynomials $f_1, \ldots, f_n$ in $n - 1$ variables $x_1, \ldots, x_{n-1}$, which allows us to
form their resultant. We will write this resultant as

(5.14) $$\mathrm{Res}^{x_n}_{d_1,\ldots,d_n}(f_1, \ldots, f_n).$$

The superscript $x_n$ reminds us that we are regarding $x_n$ as constant.
Since the resultant is a polynomial in the coefficients of the $f_i$, (5.14) is a
polynomial in $x_n$.

We can now state the main result of the hidden variable technique.

(5.15) Proposition. Generically, $\mathrm{Res}^{x_n}_{d_1,\ldots,d_n}(f_1, \ldots, f_n)$ is a polynomial
in $x_n$ whose roots are the $x_n$-coordinates of the solutions of (5.3).

Proof. The basic strategy of the proof is that by (5.12), we already know
a polynomial whose roots are the $x_n$-coordinates of the solutions, namely

$$\mathrm{Res}(u_0 - x_n, f_1, \ldots, f_n).$$

We will prove the theorem by showing that this polynomial is the same as
the hidden variable resultant (5.14). However, (5.14) is a polynomial in $x_n$,
while $\mathrm{Res}(u_0 - x_n, f_1, \ldots, f_n)$ is a polynomial in $u_0$. To compare these two
polynomials, we will write

$$\mathrm{Res}^{x_n = u_0}_{d_1,\ldots,d_n}(f_1, \ldots, f_n)$$

to mean the polynomial obtained from (5.14) by the substitution $x_n = u_0$.
Using this notation, the theorem will follow once we show that

$$\mathrm{Res}^{x_n = u_0}_{d_1,\ldots,d_n}(f_1, \ldots, f_n) = \pm\,\mathrm{Res}(u_0 - x_n, f_1, \ldots, f_n).$$

We will prove this equality by applying Theorem (3.4) separately to the
two resultants in this equation.

Beginning with $\mathrm{Res}(u_0 - x_n, f_1, \ldots, f_n)$, first recall that it equals the
homogeneous resultant $\mathrm{Res}(u_0 x_0 - x_n, F_1, \ldots, F_n)$ via (5.2). Since $u_0$ is
a coefficient, we will work over the field $\mathbb{C}(u_0)$ of rational functions in $u_0$.
Then, adapting Theorem (3.4) to the situation of (5.2) (see Exercise 8), we
see that $\mathrm{Res}(u_0 x_0 - x_n, F_1, \ldots, F_n)$ equals

(5.16) $$\mathrm{Res}_{1,d_1,\ldots,d_{n-1}}(-x_n, \bar{F}_1, \ldots, \bar{F}_{n-1})^{d_n} \det(m_{f_n}),$$

where $-x_n, \bar{F}_1, \ldots, \bar{F}_{n-1}$ are obtained from $u_0 x_0 - x_n, F_1, \ldots, F_{n-1}$ by
setting $x_0 = 0$, and $m_{f_n} : A \to A$ is multiplication by $f_n$ in the ring

$$A = \mathbb{C}(u_0)[x_1, \ldots, x_n]/\langle u_0 - x_n, f_1, \ldots, f_{n-1}\rangle.$$

Next, consider $\mathrm{Res}^{x_n = u_0}(f_1, \ldots, f_n)$, and observe that if we define

$$\hat{f}_i(x_1, \ldots, x_{n-1}) = f_i(x_1, \ldots, x_{n-1}, u_0),$$

then $\mathrm{Res}^{x_n = u_0}(f_1, \ldots, f_n) = \mathrm{Res}(\hat{f}_1, \ldots, \hat{f}_n)$. If we apply Theorem (3.4)
to the latter resultant, we see that it equals

(5.17) $$\mathrm{Res}_{d_1,\ldots,d_{n-1}}(\bar{\hat{F}}_1, \ldots, \bar{\hat{F}}_{n-1})^{d_n} \det(m_{\hat{f}_n}),$$

where $\bar{\hat{F}}_i$ is obtained from $\hat{f}_i$ by first homogenizing with respect to $x_0$ and
then setting $x_0 = 0$, and $m_{\hat{f}_n} : \hat{A} \to \hat{A}$ is multiplication by $\hat{f}_n$ in

$$\hat{A} = \mathbb{C}(u_0)[x_1, \ldots, x_{n-1}]/\langle \hat{f}_1, \ldots, \hat{f}_{n-1}\rangle.$$

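To show that (5.16) and (5.17) are equal, we first examine (5.17). We
claim that if $f_i$ homogenizes to $F_i$, then $\bar{\hat{F}}_i$ in (5.17) is given by

(5.18) $$\bar{\hat{F}}_i(x_1, \ldots, x_{n-1}) = F_i(0, x_1, \ldots, x_{n-1}, 0).$$

To prove this, take a term of $F_i$, say

$$c\, x_0^{a_0} \cdots x_n^{a_n}, \quad a_0 + \cdots + a_n = d_i.$$

Since $x_0 = 1$ gives $f_i$ and $x_n = u_0$ then gives $\hat{f}_i$, the corresponding term
in $\hat{f}_i$ is

$$c\, 1^{a_0} x_1^{a_1} \cdots x_{n-1}^{a_{n-1}} u_0^{a_n} = c\, u_0^{a_n} \cdot x_1^{a_1} \cdots x_{n-1}^{a_{n-1}}.$$

When homogenizing $\hat{f}_i$ with respect to $x_0$, we want a term of total degree
$d_i$ in $x_0, \ldots, x_{n-1}$. Since $c\, u_0^{a_n}$ is a constant, we get

$$c\, u_0^{a_n} \cdot x_0^{a_0 + a_n} x_1^{a_1} \cdots x_{n-1}^{a_{n-1}} = c \cdot x_0^{a_0} x_1^{a_1} \cdots x_{n-1}^{a_{n-1}} (u_0 x_0)^{a_n}.$$

It follows that the homogenization of $\hat{f}_i$ is $F_i(x_0, \ldots, x_{n-1}, u_0 x_0)$, and since
$\bar{\hat{F}}_i$ is obtained by setting $x_0 = 0$ in this polynomial, we get (5.18).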

Once we know (5.18), Exercise 12 of §3 shows that

$$\mathrm{Res}_{1,d_1,\ldots,d_{n-1}}(-x_n, \bar{F}_1, \ldots, \bar{F}_{n-1}) = \pm\,\mathrm{Res}_{d_1,\ldots,d_{n-1}}(\bar{\hat{F}}_1, \ldots, \bar{\hat{F}}_{n-1})$$

since $\bar{F}_i(x_1, \ldots, x_n) = F_i(0, x_1, \ldots, x_n)$. Also, the ring homomorphism

$$\mathbb{C}(u_0)[x_1, \ldots, x_n] \to \mathbb{C}(u_0)[x_1, \ldots, x_{n-1}]$$

defined by $x_n \mapsto u_0$ carries $f_i$ to $\hat{f}_i$. It follows that this homomorphism
induces a ring isomorphism $A \cong \hat{A}$ (you will check the details of this in
Exercise 8). Moreover, multiplication by $f_n$ and $\hat{f}_n$ give a diagram

(5.19)
$$\begin{array}{ccc}
A & \cong & \hat{A} \\
m_{f_n} \downarrow & & \downarrow m_{\hat{f}_n} \\
A & \cong & \hat{A}
\end{array}$$

In Exercise 8, you will show that going across and down gives the same map
$A \to \hat{A}$ as going down and across (we say that (5.19) is a commutative
diagram). From here, it is easy to show that $\det(m_{f_n}) = \det(m_{\hat{f}_n})$, and it
follows that (5.16) and (5.17) are equal. □

The advantage of the hidden variable method is that it involves
resultants with fewer equations and variables than the $u$-resultant. For
example, when dealing with the equations $f_1 = f_2 = 0$ from (5.13), the
$u$-resultant $\mathrm{Res}_{1,2,2}(f_0, f_1, f_2)$ uses the $10 \times 10$ matrix from Exercise 2, while
$\mathrm{Res}^{x_2}_{2,2}(f_1, f_2)$ requires a $4 \times 4$ matrix.
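
The matrix sizes quoted here are monomial counts: for polynomials of degrees $d_0, \ldots, d_m$ in $m + 1$ homogeneous variables, the numerator matrix is indexed by the monomials of total degree $d = \sum (d_i - 1) + 1$. The sketch below is only a counting aid under that assumption; the function name `numerator_matrix_size` is ours.

```python
from math import comb

def numerator_matrix_size(degrees):
    """Count the monomials of total degree d = sum(d_i - 1) + 1
    in len(degrees) homogeneous variables."""
    n_vars = len(degrees)
    d = sum(di - 1 for di in degrees) + 1
    # Number of monomials of degree d in n_vars variables.
    return comb(d + n_vars - 1, n_vars - 1)

print(numerator_matrix_size((1, 2, 2)))     # u-resultant for (5.13)
print(numerator_matrix_size((2, 2)))        # hidden variable resultant for (5.13)
print(numerator_matrix_size((1, 2, 2, 2)))  # u-resultant for three quadrics
print(numerator_matrix_size((2, 2, 2)))     # hidden variable version
```

The first two values reproduce the $10 \times 10$ and $4 \times 4$ sizes above.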

In general, we can compute $\mathrm{Res}^{x_n}_{d_1,\ldots,d_n}(f_1, \ldots, f_n)$ by Theorem (4.9), and as
with the $u$-resultant, we can again ignore the denominator. More precisely,
if we write

(5.20) $$\mathrm{Res}^{x_n}_{d_1,\ldots,d_n}(f_1, \ldots, f_n) = \pm \frac{\hat{D}_0}{\hat{D}_0'},$$

then $\hat{D}_0'$ doesn't involve $x_n$. The proof of this result is a nice application of
Proposition (4.6), and the details can be found in Exercise 10 at the end
of the section. Thus, when using the hidden variable method, it suffices
to use the numerator $\hat{D}_0$: when $f_1, \ldots, f_n$ are generic, its roots give the
$x_n$-coordinates of the solutions of the affine equations (5.3).

Of course, there is nothing special about hiding $x_n$; we can hide any of
the variables in the same way, so that the hidden variable method can be
used to find the $x_i$-coordinates of the solutions for any $i$. One limitation of
this method is that it only gives the individual coordinates of the solution
points and doesn't tell us how they match up.

Exercise 5. Consider the affine equations

$$f_1 = x_1^2 + x_2^2 + x_3^2 - 3$$
$$f_2 = x_1^2 + x_3^2 - 2$$
$$f_3 = x_1^2 + x_2^2 - 2x_3.$$

a. If we compute the $u$-resultant with $f_0 = u_0 + u_1 x_1 + u_2 x_2 + u_3 x_3$, show
that Theorem (4.9) expresses $\mathrm{Res}_{1,2,2,2}(f_0, f_1, f_2, f_3)$ as a quotient of
determinants of sizes $35 \times 35$ and $15 \times 15$ respectively.

b. If we hide $x_3$, show that $\mathrm{Res}^{x_3}_{2,2,2}(f_1, f_2, f_3)$ is a quotient of determinants
of sizes $15 \times 15$ and $3 \times 3$ respectively.

c. Hiding $x_3$ as in part b, use (2.8) to express $\mathrm{Res}^{x_3}_{2,2,2}(f_1, f_2, f_3)$ as the
determinant of a $6 \times 6$ matrix, and show that up to a constant, the
resultant is $(x_3^2 + 2x_3 - 3)^4$. Explain the significance of the exponent 4.
Hint: You will need to regard $x_3$ as a constant and homogenize the $f_i$
with respect to $x_0$. Then (2.8) will be easy to apply.

The last part of Exercise 5 illustrates how formulas such as (2.8) allow
us, in special cases, to represent a resultant as a single determinant of 
relatively small size. This can reduce dramatically the amount of compu- 
tation involved and explains the continuing interest in finding determinant 
formulas for resultants (see, for example, [SZ]). 



Additional Exercises for §5 

Exercise 6. In the text, we claimed that generically, the solutions of $n$
affine equations $f_1 = \cdots = f_n = 0$ have multiplicity one.
This exercise will prove this result. Assume as usual that the $f_i$ come from
homogeneous polynomials $F_i$ of degree $d_i$ by setting $x_0 = 1$. We will also
use the following fact from multiplicity theory: if $F_1 = \cdots = F_n = 0$ has
finitely many solutions and $p$ is a solution such that the gradient vectors

$$\nabla F_i(p) = \Big(\frac{\partial F_i}{\partial x_0}(p), \ldots, \frac{\partial F_i}{\partial x_n}(p)\Big), \quad 1 \le i \le n,$$

are linearly independent, then $p$ is a solution of multiplicity one.

a. Consider the affine space $\mathbb{C}^M$ consisting of all possible coefficients of the
$F_i$. As in the discussion at the end of §2, the coordinates of $\mathbb{C}^M$ are $c_{i,\alpha}$,
where for fixed $i$, the $c_{i,\alpha}$ are the coefficients of $F_i$. Now consider the set
$W \subset \mathbb{C}^M \times \mathbb{P}^n \times \mathbb{P}^{n-1}$ defined by

$$W = \{(c_{i,\alpha}, p, a_1, \ldots, a_n) \in \mathbb{C}^M \times \mathbb{P}^n \times \mathbb{P}^{n-1} : p \text{ is a nontrivial solution of } F_1 = \cdots = F_n = 0 \text{ and } a_1 \nabla F_1(p) + \cdots + a_n \nabla F_n(p) = 0\}.$$

Under the projection map $\pi : \mathbb{C}^M \times \mathbb{P}^n \times \mathbb{P}^{n-1} \to \mathbb{C}^M$, explain why
a generalization of the Projective Extension Theorem from §2 would
imply that $\pi(W) \subset \mathbb{C}^M$ is a variety.

b. Show that $\pi(W) \subset \mathbb{C}^M$ is a proper variety, i.e., find $F_1, \ldots, F_n$ such
that $(F_1, \ldots, F_n) \in \mathbb{C}^M \setminus \pi(W)$. Hint: Let $F_i = \prod_{j=1}^{d_i} (x_i - j x_0)$ for
$1 \le i \le n$.

c. By part b, we can find a nonzero polynomial $G$ in the coefficients of the
$F_i$ such that $G$ vanishes on $\pi(W)$. Then consider $G \cdot \mathrm{Res}(\bar{F}_1, \ldots, \bar{F}_n)$.
We can regard this as a polynomial in the coefficients of the $f_i$. Prove
that if this polynomial is nonvanishing at $f_1, \ldots, f_n$, then the equations
$f_1 = \cdots = f_n = 0$ have $d_1 \cdots d_n$ many solutions in $\mathbb{C}^n$, all of which
have multiplicity one. Hint: Use Theorem (5.5).

Exercise 7. As we saw in (5.12), we can find the $x_n$-coordinates of the
solutions using $\mathrm{Res}(u - x_n, f_1, \ldots, f_n)$, and in general, the $x_i$-coordinates
can be found by replacing $u - x_n$ by $u - x_i$ in the resultant. In this exercise,
we will describe the method given in [CM] for matching up coordinates to
get the solutions. We begin by assuming that we've found the $x_1$- and
$x_2$-coordinates of the solutions. To match up these two coordinates, let $\alpha$ and
$\beta$ be randomly chosen numbers, and consider the resultant

$$R_{1,2}(u) = \mathrm{Res}_{1,d_1,\ldots,d_n}(u - (\alpha x_1 + \beta x_2), f_1, \ldots, f_n).$$

a. Use (5.9) to show that

$$R_{1,2}(u) = C' \prod_{i=1}^{d_1 \cdots d_n} \big(u - (\alpha a_{i1} + \beta a_{i2})\big),$$

where $C'$ is a nonzero constant and, as in (5.9), the solutions are
$p_i = (a_{i1}, \ldots, a_{in})$.

b. A random choice of $\alpha$ and $\beta$ will ensure that for solutions $p_i, p_j, p_k$, we
have $\alpha a_{i1} + \beta a_{j2} \ne \alpha a_{k1} + \beta a_{k2}$ except when $p_i = p_j = p_k$. Conclude
that the only way the condition

$$\alpha \cdot (\text{an } x_1\text{-coordinate}) + \beta \cdot (\text{an } x_2\text{-coordinate}) = \text{root of } R_{1,2}(u)$$

can hold is when the $x_1$-coordinate and $x_2$-coordinate come from the
same solution.

c. Explain how we can now find the first two coordinates of the solutions.

d. Explain how a random choice of $\alpha, \beta, \gamma$ will enable us to construct a
polynomial $R_{1,2,3}(u)$ which will tell us how to match up the $x_3$-coordinates
with the two coordinates already found.

e. In the affine equations $f_1 = f_2 = 0$ coming from (5.13), compute
$\mathrm{Res}(u - x_1, f_1, f_2)$, $\mathrm{Res}(u - x_2, f_1, f_2)$ and (in the notation of part a)
$R_{1,2}(u)$, using $\alpha = 1$ and $\beta = 2$. Find the roots of these polynomials
numerically and explain how this gives the solutions of our equations.
Hint: Try the Maple command fsolve. In general, fsolve should be
used with the complex option, though in this case it's not necessary
since the roots are real.
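
The matching step of parts b and c can be tried out numerically. The sketch below uses SymPy in place of Maple and makes two substitutions that are ours, not the text's: the coordinate polynomials come from hidden variable resultants, and the polynomial in $u$ with the same roots as $R_{1,2}(u)$ is taken from a lex Gröbner basis of $\langle f_1, f_2, u - (\alpha x_1 + \beta x_2)\rangle$ rather than from a multipolynomial resultant. The helper name `numeric_roots` is hypothetical.

```python
import sympy as sp

x1, x2, u = sp.symbols("x1 x2 u")
f1 = x1**2 + x2**2 - 10
f2 = x1**2 + x1*x2 + 2*x2**2 - 16
alpha, beta = 1, 2

def numeric_roots(expr, var):
    """Numerical roots of a univariate polynomial, as Python complex numbers."""
    return [complex(r) for r in sp.Poly(expr, var).nroots()]

# Individual coordinates (not yet matched up).
xs = numeric_roots(sp.resultant(f1, f2, x2), x1)
ys = numeric_roots(sp.resultant(f1, f2, x1), x2)

# A polynomial in u vanishing exactly at the values alpha*a1 + beta*a2.
# Assumption: elimination via a Groebner basis stands in for R_{1,2}(u).
G = sp.groebner([f1, f2, u - (alpha*x1 + beta*x2)], x1, x2, u, order="lex")
r12 = [g for g in G.exprs if g.free_symbols == {u}][0]
us = numeric_roots(r12, u)

# Keep the pairs (a, b) for which alpha*a + beta*b is a root.
matched = [(a, b) for a in xs for b in ys
           if any(abs(alpha*a + beta*b - r) < 1e-8 for r in us)]

for a, b in matched:
    print(round(a.real, 6), round(b.real, 6))
```

Of the sixteen candidate pairs, only those coming from an actual solution survive the test.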

Exercise 8. This exercise is concerned with Proposition (5.15).

a. Explain what Theorem (3.4) looks like if we use (5.2) instead of (3.3),
and apply this to (5.16), (5.17) and Proposition (5.8).

b. Show carefully that the ring homomorphism

$$\mathbb{C}(u)[x_1, \ldots, x_n] \to \mathbb{C}(u)[x_1, \ldots, x_{n-1}]$$

defined by $x_n \mapsto u$ carries $f_i$ to $\hat{f}_i$ and induces a ring isomorphism
$A \cong \hat{A}$.

c. Show that the diagram (5.19) is commutative and use it to prove that
$\det(m_{f_n}) = \det(m_{\hat{f}_n})$.

Exercise 9. In this exercise, you will develop a homogeneous version of
the hidden variable method. Suppose that we have homogeneous polynomials
$F_1, \ldots, F_n$ in $x_0, \ldots, x_n$ such that

$$f_i(x_1, \ldots, x_n) = F_i(1, x_1, \ldots, x_n).$$

We assume that $F_i$ has degree $d_i$, so that $f_i$ has degree at most $d_i$. Also
define

$$\hat{f}_i(x_1, \ldots, x_{n-1}) = f_i(x_1, \ldots, x_{n-1}, u).$$

As we saw in the proof of Proposition (5.15), the hidden variable resultant can
be regarded as the affine resultant $\mathrm{Res}_{d_1,\ldots,d_n}(\hat{f}_1, \ldots, \hat{f}_n)$. To get a
homogeneous resultant, we homogenize $\hat{f}_i$ with respect to $x_0$ to get a homogeneous
polynomial $\hat{F}_i(x_0, \ldots, x_{n-1})$ of degree $d_i$. Then

$$\mathrm{Res}_{d_1,\ldots,d_n}(\hat{f}_1, \ldots, \hat{f}_n) = \mathrm{Res}_{d_1,\ldots,d_n}(\hat{F}_1, \ldots, \hat{F}_n).$$

a. Prove that

$$\hat{F}_i(x_0, \ldots, x_{n-1}) = F_i(x_0, x_1, \ldots, x_{n-1}, x_0 u).$$

Hint: This is done in the proof of Proposition (5.15).

b. Explain how part a leads to a purely homogeneous construction of the
hidden variable resultant. This resultant is a polynomial in $u$.

c. State a purely homogeneous version of Proposition (5.15) and explain
how it follows from the affine version stated in the text. Also explain why
the roots of the hidden variable resultant are $a_n/a_0$ as $p = (a_0, \ldots, a_n)$
varies over all homogeneous solutions of $F_1 = \cdots = F_n = 0$ in $\mathbb{P}^n$.

Exercise 10. In (5.20), we expressed the hidden variable resultant as a
quotient of two determinants $\pm \hat{D}_0/\hat{D}_0'$. If we think of this resultant as a
polynomial in $u$, then use Proposition (4.6) to prove that the denominator
$\hat{D}_0'$ does not involve $u$. This will imply that the numerator $\hat{D}_0$ can be
regarded as the hidden variable resultant. Hint: By the previous exercise,
we can write the hidden variable resultant as $\mathrm{Res}(\hat{F}_1, \ldots, \hat{F}_n)$. Also note
that Proposition (4.6) assumed that $x_n$ is last, while here $\hat{D}_0$ and $\hat{D}_0'$ mean
that $x_0$ is taken last. Thus, applying Proposition (4.6) to the $\hat{F}_i$ means
setting $x_0 = 0$ in $\hat{F}_i$. Then use part a of Exercise 9 to explain why $u$
disappears from the scene.

Exercise 11. Suppose that $f_1, \ldots, f_n$ are polynomials of total degrees
$d_1, \ldots, d_n$ in $k[x_1, \ldots, x_n]$.

a. Use Theorem (2.10) of Chapter 2 to prove that the ideal $\langle f_1, \ldots, f_n\rangle$ is
radical for $f_1, \ldots, f_n$ generic. Hint: Use the notion of generic discussed
in Exercise 6.

b. Explain why Exercise 16 of Chapter 2, §4, describes a lex Gröbner basis
(assuming $x_n$ is the last variable) for the ideal $\langle f_1, \ldots, f_n\rangle$ when the $f_i$
are generic.



§6 Solving Equations via Eigenvalues

In Chapter 2, we learned that solving the equations $f_1 = \cdots = f_n = 0$ can
be reduced to an eigenvalue problem. We did this as follows. The monomials
not divisible by the leading terms of a Gröbner basis $G$ for $\langle f_1, \ldots, f_n\rangle$ give
a basis for the quotient ring

(6.1) $$A = \mathbb{C}[x_1, \ldots, x_n]/\langle f_1, \ldots, f_n\rangle$$

(see §2 of Chapter 2). Using this basis, we find the matrix of a multiplication
map $m_{f_0}$ by taking a basis element $x^\alpha$ and computing the remainder of
$x^\alpha f_0$ on division by $G$ (see §4 of Chapter 2). Once we have this matrix, its
eigenvalues are the values $f_0(p)$ for $p \in \mathbf{V}(f_1, \ldots, f_n)$ by Theorem (4.5)
of Chapter 2. In particular, the eigenvalues of the matrix for $m_{x_i}$ are the
$x_i$-coordinates of the solution points.

The amazing fact is that we can do all of this using resultants! We first
show how to find a basis for the quotient ring.

(6.2) Theorem. If $f_1, \ldots, f_n$ are generic polynomials of total degree
$d_1, \ldots, d_n$, then the cosets of the monomials

$$x_1^{a_1} \cdots x_n^{a_n}, \quad \text{where } 0 \le a_i \le d_i - 1 \text{ for } i = 1, \ldots, n$$

form a basis of the ring $A$ of (6.1).

Proof. Note that these monomials are precisely the monomials obtained
from $S_0$ in (5.11) by setting $x_0 = 1$. As we will see, this is no accident.
By $f_1, \ldots, f_n$ generic, we mean that there are no solutions at $\infty$, that all
solutions have multiplicity one, and that the matrix $M_{11}$ which appears
below is invertible.

Our proof will follow [ER] (see [PS1] for a different proof). There are
$d_1 \cdots d_n$ monomials $x_1^{a_1} \cdots x_n^{a_n}$ with $0 \le a_i \le d_i - 1$. Since this is the
dimension of $A$ in the generic case by Theorem (5.5), it suffices to show
that the cosets of these monomials are linearly independent.

To prove this, we will use resultants. However, we have the wrong number
of polynomials: since $f_1, \ldots, f_n$ are not homogeneous, we need $n + 1$
polynomials in order to form a resultant. Hence we will add the polynomial
$f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n$, where $u_0, \ldots, u_n$ are independent
variables. This gives the resultant $\mathrm{Res}_{1,d_1,\ldots,d_n}(f_0, \ldots, f_n)$, which we recognize
as the $u$-resultant. By (5.10), this resultant is the quotient $D_0/D_0'$, where
$D_0 = \det(M_0)$ and $M_0$ is the matrix coming from the equations (4.1).

We first need to review in detail how the matrix $M_0$ is constructed.
Although we did this in (4.1), our present situation is different in two ways:
first, (4.1) ordered the variables so that $x_n$ was last, while here, we want
$x_0$ to be last, and second, (4.1) dealt with homogeneous polynomials, while
here we have dehomogenized by setting $x_0 = 1$. Let's see what changes this
makes.

As before, we begin in the homogeneous situation and consider monomials
$x^\gamma = x_0^{a_0} \cdots x_n^{a_n}$ of total degree $d = 1 + d_1 + \cdots + d_n - n$ (remember
that the resultant is $\mathrm{Res}_{1,d_1,\ldots,d_n}$). Since we want to think of $x_0$ as last, we
divide these monomials into $n + 1$ disjoint sets as follows:

$$S_n = \{x^\gamma : |\gamma| = d,\ x_n^{d_n} \text{ divides } x^\gamma\}$$
$$S_{n-1} = \{x^\gamma : |\gamma| = d,\ x_n^{d_n} \text{ doesn't divide } x^\gamma \text{ but } x_{n-1}^{d_{n-1}} \text{ does}\}$$
$$\vdots$$
$$S_0 = \{x^\gamma : |\gamma| = d,\ x_n^{d_n}, \ldots, x_1^{d_1} \text{ don't divide } x^\gamma \text{ but } x_0 \text{ does}\}$$

(remember that $d_0 = 1$ in this case). You should check that $S_0$ is precisely
as described in (5.11). The next step is to dehomogenize the elements of
$S_i$ by setting $x_0 = 1$. If we denote the resulting set of monomials as $S_i'$,
then $S_0' \cup S_1' \cup \cdots \cup S_n'$ consists of all monomials of total degree $\le d$ in
$x_1, \ldots, x_n$. Furthermore, we see that $S_0'$ consists of the $d_1 \cdots d_n$ monomials
in the statement of the theorem.
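
The partition just described is easy to enumerate for small cases. The following is a minimal sketch; the function name `partition_monomials` and the representation of a monomial by its exponent tuple $(a_0, \ldots, a_n)$ are our own choices.

```python
from itertools import product

def partition_monomials(degrees):
    """degrees = (d_0, d_1, ..., d_n) with d_0 = 1.
    Returns [S_0, ..., S_n], each a list of exponent tuples (a_0, ..., a_n)
    of monomials of total degree d = d_0 + ... + d_n - n."""
    n = len(degrees) - 1
    d = sum(degrees) - n
    sets = [[] for _ in degrees]
    for exps in product(range(d + 1), repeat=n + 1):
        if sum(exps) != d:
            continue
        # The largest i >= 1 with x_i^{d_i} dividing the monomial decides
        # the set; if there is none, the monomial lies in S_0.
        index = 0
        for i in range(n, 0, -1):
            if exps[i] >= degrees[i]:
                index = i
                break
        sets[index].append(exps)
    return sets

sets = partition_monomials((1, 2, 2))
print([len(s) for s in sets])
# Dehomogenizing S_0 (dropping the exponent of x_0) gives the basis monomials.
print(sorted(e[1:] for e in sets[0]))
```

For degrees $(1, 2, 2)$ the set $S_0$ dehomogenizes to the four monomials $1, x_1, x_2, x_1 x_2$, in agreement with the count $d_1 d_2 = 4$.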

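Because of our emphasis on $S_0'$, we will use $x^\alpha$ to denote elements of $S_0'$
and $x^\beta$ to denote elements of $S_1' \cup \cdots \cup S_n'$. Then observe that

if $x^\alpha \in S_0'$, then $x^\alpha$ has degree $\le d - 1$,

if $x^\beta \in S_i'$, $i > 0$, then $x^\beta / x_i^{d_i}$ has degree $\le d - d_i$.

Then consider the equations:

$$x^\alpha f_0 = 0 \quad \text{for all } x^\alpha \in S_0'$$
$$(x^\beta / x_1^{d_1})\, f_1 = 0 \quad \text{for all } x^\beta \in S_1'$$
$$\vdots$$
$$(x^\beta / x_n^{d_n})\, f_n = 0 \quad \text{for all } x^\beta \in S_n'.$$

Since the $x^\alpha f_0$ and $(x^\beta / x_i^{d_i}) f_i$ have total degree $\le d$, we can write these
polynomials as linear combinations of the $x^\alpha$ and $x^\beta$. We will order these
monomials so that the elements $x^\alpha \in S_0'$ come first, followed by the
elements $x^\beta \in S_1' \cup \cdots \cup S_n'$. This gives a square matrix $M_0$ such that

$$M_0 \begin{pmatrix} x^{\alpha_1} \\ x^{\alpha_2} \\ \vdots \\ x^{\beta_1} \\ x^{\beta_2} \\ \vdots \end{pmatrix} = \begin{pmatrix} x^{\alpha_1} f_0 \\ x^{\alpha_2} f_0 \\ \vdots \\ (x^{\beta_1}/x_1^{d_1})\, f_1 \\ (x^{\beta_2}/x_1^{d_1})\, f_1 \\ \vdots \end{pmatrix},$$

where, in the column on the left, the first two elements of $S_0'$ and the first
two elements of $S_1'$ are listed explicitly. This should make it clear what the
whole column looks like. The situation is similar for the column on the
right.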

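For $p \in \mathbf{V}(f_1, \ldots, f_n)$, we have $f_1(p) = \cdots = f_n(p) = 0$. Thus,
evaluating the above equation at $p$ yields

$$M_0 \begin{pmatrix} p^{\alpha_1} \\ p^{\alpha_2} \\ \vdots \\ p^{\beta_1} \\ p^{\beta_2} \\ \vdots \end{pmatrix} = \begin{pmatrix} p^{\alpha_1} f_0(p) \\ p^{\alpha_2} f_0(p) \\ \vdots \\ 0 \\ 0 \\ \vdots \end{pmatrix}.$$

To simplify notation, we let $p^\alpha$ be the column vector $(p^{\alpha_1}, p^{\alpha_2}, \ldots)^T$ given
by evaluating all monomials in $S_0'$ at $p$ (and $T$ means transpose). Similarly,
we let $p^\beta$ be the column vector $(p^{\beta_1}, p^{\beta_2}, \ldots)^T$ given by evaluating all
monomials in $S_1' \cup \cdots \cup S_n'$ at $p$. With this notation, we can rewrite the
above equation more compactly as

(6.3) $$M_0 \begin{pmatrix} p^\alpha \\ p^\beta \end{pmatrix} = \begin{pmatrix} f_0(p)\, p^\alpha \\ 0 \end{pmatrix}.$$

The next step is to partition $M_0$ so that the rows and columns of $M_0$
corresponding to elements of $S_0'$ lie in the upper left hand corner. This
means writing $M_0$ in the form

$$M_0 = \begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix},$$

where $M_{00}$ is a $\mu \times \mu$ matrix for $\mu = d_1 \cdots d_n$, and $M_{11}$ is also a square
matrix. With this notation, (6.3) can be written

(6.4) $$\begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix} \begin{pmatrix} p^\alpha \\ p^\beta \end{pmatrix} = \begin{pmatrix} f_0(p)\, p^\alpha \\ 0 \end{pmatrix}.$$

By Lemma 4.4 of [Emi1], $M_{11}$ is invertible for most choices of $f_1, \ldots, f_n$.
Note that this condition is generic since it is given by $\det(M_{11}) \ne 0$ and
$\det(M_{11})$ is a polynomial in the coefficients of the $f_i$. Hence, for generic
$f_1, \ldots, f_n$, we can define the $\mu \times \mu$ matrix

(6.5) $$\tilde{M} = M_{00} - M_{01} M_{11}^{-1} M_{10}.$$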

Note that the entries of $\tilde{M}$ are polynomials in $u_0, \ldots, u_n$ since these
variables only appear in $M_{00}$ and $M_{01}$. If we multiply each side of (6.4) on the
left by the matrix

$$\begin{pmatrix} I & -M_{01} M_{11}^{-1} \\ 0 & I \end{pmatrix},$$

then an easy computation gives

$$\begin{pmatrix} \tilde{M} & 0 \\ M_{10} & M_{11} \end{pmatrix} \begin{pmatrix} p^\alpha \\ p^\beta \end{pmatrix} = \begin{pmatrix} f_0(p)\, p^\alpha \\ 0 \end{pmatrix}.$$

This implies

(6.6) $$\tilde{M}\, p^\alpha = f_0(p)\, p^\alpha,$$

so that for each $p \in \mathbf{V}(f_1, \ldots, f_n)$, $f_0(p)$ is an eigenvalue of $\tilde{M}$ with $p^\alpha$ as
the corresponding eigenvector. Since $f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n$, the
eigenvalues $f_0(p)$ are distinct for $p \in \mathbf{V}(f_1, \ldots, f_n)$. Standard linear algebra
implies that the corresponding eigenvectors $p^\alpha$ are linearly independent.

We can now prove the theorem. Write the elements of $S_0'$ as $x^{\alpha_1}, \ldots, x^{\alpha_\mu}$,
where as usual $\mu = d_1 \cdots d_n$, and recall that we need only show that the
cosets $[x^{\alpha_1}], \ldots, [x^{\alpha_\mu}]$ are linearly independent in the quotient ring $A$. So
suppose we have a linear relation among these cosets, say

$$c_1 [x^{\alpha_1}] + \cdots + c_\mu [x^{\alpha_\mu}] = 0.$$

Evaluating this equation at $p \in \mathbf{V}(f_1, \ldots, f_n)$ makes sense by Exercise 12
of Chapter 2, §4 and implies that $c_1 p^{\alpha_1} + \cdots + c_\mu p^{\alpha_\mu} = 0$. In the generic
case, $\mathbf{V}(f_1, \ldots, f_n)$ has $\mu = d_1 \cdots d_n$ points $p_1, \ldots, p_\mu$, which gives $\mu$
equations

$$c_1 p_1^{\alpha_1} + \cdots + c_\mu p_1^{\alpha_\mu} = 0$$
$$\vdots$$
$$c_1 p_\mu^{\alpha_1} + \cdots + c_\mu p_\mu^{\alpha_\mu} = 0.$$

In the matrix of these equations, the $i$th row is $(p_i^{\alpha_1}, \ldots, p_i^{\alpha_\mu})$, which in
the notation used above, is the transpose of the column vector $p_i^\alpha$ obtained
by evaluating the monomials in $S_0'$ at $p_i$. The discussion following (6.6)
showed that the vectors $p_i^\alpha$ are linearly independent. Thus the rows are
linearly independent, so $c_1 = \cdots = c_\mu = 0$. We conclude that the cosets
$[x^{\alpha_1}], \ldots, [x^{\alpha_\mu}]$ are linearly independent. □

Now that we know a basis for the quotient ring $A$, our next task is to find
the matrix of the multiplication map $m_{f_0}$ relative to this basis. Fortunately,
this is easy since we already know the matrix!

(6.7) Theorem. Let $f_1, \ldots, f_n$ be generic polynomials of total degrees
$d_1, \ldots, d_n$, and let $f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n$. Using the basis of
$A = \mathbb{C}[x_1, \ldots, x_n]/\langle f_1, \ldots, f_n\rangle$ from Theorem (6.2), the matrix of the
multiplication map $m_{f_0}$ is the transpose of the matrix

$$\tilde{M} = M_{00} - M_{01} M_{11}^{-1} M_{10}$$

from (6.5).

Proof. Let $M_{f_0} = (m_{ij})$ be the matrix of $m_{f_0}$ relative to the basis
$[x^{\alpha_1}], \ldots, [x^{\alpha_\mu}]$ of $A$ from Theorem (6.2), where $\mu = d_1 \cdots d_n$. The proof
of Proposition (4.7) of Chapter 2 shows that for $p \in \mathbf{V}(f_1, \ldots, f_n)$, we
have

$$f_0(p)\,(p^{\alpha_1}, \ldots, p^{\alpha_\mu}) = (p^{\alpha_1}, \ldots, p^{\alpha_\mu})\, M_{f_0}.$$

Letting $p^\alpha$ denote the column vector $(p^{\alpha_1}, \ldots, p^{\alpha_\mu})^T$ as in the previous
proof, we can take the transpose of each side of this equation to obtain

$$f_0(p)\, p^\alpha = \big(f_0(p)\,(p^{\alpha_1}, \ldots, p^{\alpha_\mu})\big)^T = \big((p^{\alpha_1}, \ldots, p^{\alpha_\mu})\, M_{f_0}\big)^T = (M_{f_0})^T p^\alpha,$$

where $(M_{f_0})^T$ is the transpose of $M_{f_0}$. Comparing this to (6.6), we get

$$(M_{f_0})^T p^\alpha = \tilde{M}\, p^\alpha$$

for all $p \in \mathbf{V}(f_1, \ldots, f_n)$. Since $f_1, \ldots, f_n$ are generic, we have $\mu$ points
$p \in \mathbf{V}(f_1, \ldots, f_n)$, and the proof of Theorem (6.2) shows that the
corresponding eigenvectors $p^\alpha$ are linearly independent. This implies $(M_{f_0})^T = \tilde{M}$,
and then $M_{f_0} = \tilde{M}^T$ follows easily. □

Since $f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n$, Corollary (4.3) of Chapter 2 implies

$$M_{f_0} = u_0 I + u_1 M_{x_1} + \cdots + u_n M_{x_n},$$

where $M_{x_i}$ is the matrix of $m_{x_i}$ relative to the basis of Theorem (6.2). By
Theorem (6.7), it follows that if we write

(6.8) $$\tilde{M} = u_0 I + u_1 \tilde{M}_1 + \cdots + u_n \tilde{M}_n,$$

where each $\tilde{M}_i$ has constant entries, then $M_{f_0} = \tilde{M}^T$ implies that $M_{x_i} =
(\tilde{M}_i)^T$ for all $i$. Thus $\tilde{M}$ simultaneously computes the matrices of the $n$
multiplication maps $m_{x_1}, \ldots, m_{x_n}$.

Exercise 1. For the equations

$$f_1 = x_1^2 + x_2^2 - 10 = 0$$
$$f_2 = x_1^2 + x_1 x_2 + 2x_2^2 - 16 = 0$$

(this is the affine version of Exercise 2 of §5), show that $\tilde{M}$ is the matrix

$$\tilde{M} = \begin{pmatrix} u_0 & u_1 & u_2 & 0 \\ 4u_1 & u_0 & 0 & u_1 + u_2 \\ 6u_2 & 0 & u_0 & u_1 - u_2 \\ 0 & 3u_1 + 3u_2 & 2u_1 - 2u_2 & u_0 \end{pmatrix}.$$

Use this to determine the matrices $M_{x_1}$ and $M_{x_2}$. What is the basis of
$\mathbb{C}[x_1, x_2]/\langle f_1, f_2\rangle$ in this case? Hint: The matrix $M_0$ of Exercise 2 of §5 is
already partitioned into the appropriate submatrices.
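
One way to check a hand computation of (6.5) for this example is to build $M_0$ symbolically and form the block expression $M_{00} - M_{01} M_{11}^{-1} M_{10}$. The sketch below does this in SymPy; the ordering of the monomials inside each set and all helper names (`mono`, `coeff_row`) are our own choices, and monomials are stored as exponent pairs $(a_1, a_2)$.

```python
import sympy as sp

x1, x2, u0, u1, u2 = sp.symbols("x1 x2 u0 u1 u2")
f0 = u0 + u1*x1 + u2*x2
f1 = x1**2 + x2**2 - 10
f2 = x1**2 + x1*x2 + 2*x2**2 - 16

# Dehomogenized monomial sets for d = 3, as exponent pairs (a1, a2).
S0 = [(0, 0), (1, 0), (0, 1), (1, 1)]   # 1, x1, x2, x1*x2
S1 = [(2, 0), (2, 1), (3, 0)]           # divisible by x1^2 but not by x2^2
S2 = [(0, 2), (1, 2), (0, 3)]           # divisible by x2^2
monomials = S0 + S1 + S2

def mono(e, drop=(0, 0)):
    """The monomial x^e divided by x^drop."""
    return x1**(e[0] - drop[0]) * x2**(e[1] - drop[1])

# Row polynomials: x^alpha f0, (x^beta / x1^2) f1, (x^beta / x2^2) f2.
rows = ([mono(e) * f0 for e in S0]
        + [mono(e, (2, 0)) * f1 for e in S1]
        + [mono(e, (0, 2)) * f2 for e in S2])

def coeff_row(poly):
    """Coefficients of poly with respect to the ordered monomial list."""
    p = sp.Poly(sp.expand(poly), x1, x2)
    return [p.nth(a, b) for (a, b) in monomials]

M0 = sp.Matrix([coeff_row(r) for r in rows])
M00, M01 = M0[:4, :4], M0[:4, 4:]
M10, M11 = M0[4:, :4], M0[4:, 4:]

# The matrix of (6.5).
M_tilde = (M00 - M01 * M11.inv() * M10).applyfunc(sp.expand)
sp.pprint(M_tilde)
```

With the basis ordered as $1, x_1, x_2, x_1 x_2$, the output can be compared entry by entry with the matrix displayed in the exercise.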







Now that we have the matrices $M_{x_i}$, we can find the $x_i$-coordinates of
the solutions of (5.3) using the eigenvalue methods mentioned in
Chapter 2 (see especially the discussion following Corollary (4.6)). This still
leaves the problem of finding how the coordinates match up. We will follow
Chapter 2 and show how the right eigenvectors of $M_{f_0}$, or equivalently, the
left eigenvectors of $\tilde{M} = (M_{f_0})^T$, give the solutions of our equations.

Since $\tilde{M}$ involves the variables $u_0, \ldots, u_n$, we need to specialize them
before we can use numerical methods for finding eigenvectors. Let

$$f_0' = c_0 + c_1 x_1 + \cdots + c_n x_n,$$

where $c_0, \ldots, c_n$ are constants chosen so that the values $f_0'(p)$ are distinct
for $p \in \mathbf{V}(f_1, \ldots, f_n)$. In practice, this can be achieved by making a
random choice of $c_0, \ldots, c_n$. If we let $\tilde{M}'$ be the matrix obtained from $\tilde{M}$ by
letting $u_i = c_i$, then (6.6) shows that $p^\alpha$ is a left eigenvector for $\tilde{M}'$ with
eigenvalue $f_0'(p)$. Since we have $\mu = d_1 \cdots d_n$ distinct eigenvalues in a
vector space of the same dimension, the corresponding eigenspaces all have
dimension 1.

To find the solutions, suppose that we've used a standard numerical
method to find an eigenvector $v$ of $\tilde{M}'$. Since the eigenspaces all have
dimension 1, it follows that $v = \lambda\, p^\alpha$ for some solution $p \in \mathbf{V}(f_1, \ldots, f_n)$
and nonzero constant $\lambda$. This means that whenever $x^\alpha$ is a monomial in
$S_0'$, the corresponding coordinate of $v$ is $\lambda p^\alpha$. The following exercise shows
how to reconstruct $p$ from the coordinates of the eigenvector $v$.

Exercise 2. As above, let $p = (a_1, \ldots, a_n) \in \mathbf{V}(f_1, \ldots, f_n)$ and let $v$ be
an eigenvector of $\tilde{M}'$ with eigenvalue $f_0'(p)$. This exercise will explain how
to recover $p$ from $v$ when $d_1, \ldots, d_n$ are all $> 1$, and Exercise 5 at the end
of the section will explore what happens when some of the degrees equal 1.

a. Show that $1, x_1, \ldots, x_n \in S_0'$, and conclude that for some $\lambda \ne 0$, the
numbers $\lambda, \lambda a_1, \ldots, \lambda a_n$ are among the coordinates of $v$.

b. Prove that $a_j$ can be computed from the coordinates of $v$ by the formula

$$a_j = \frac{\lambda a_j}{\lambda} \quad \text{for } j = 1, \ldots, n.$$

This shows that the solution $p$ can be easily found using ratios of certain
coordinates of the eigenvector $v$.
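
The ratio formula can be tried numerically on the $4 \times 4$ matrix of Exercise 1. In the sketch below, the specialization $(c_0, c_1, c_2) = (0.3, 1, 2)$ is an arbitrary choice of ours, and the basis is assumed to be ordered as $1, x_1, x_2, x_1 x_2$. NumPy's `eig` returns column vectors $v$ with $\tilde{M}' v = \lambda v$, which is the relation (6.6).

```python
import numpy as np

def M_tilde(u0, u1, u2):
    """The matrix of Exercise 1, specialized to numerical u0, u1, u2."""
    return np.array([
        [u0,    u1,          u2,          0.0],
        [4*u1,  u0,          0.0,         u1 + u2],
        [6*u2,  0.0,         u0,          u1 - u2],
        [0.0,   3*u1 + 3*u2, 2*u1 - 2*u2, u0],
    ], dtype=float)

def solutions_from_eigenvectors(c):
    """Recover the points p = (a1, a2) from eigenvectors v = lambda * p^alpha."""
    _, eigvecs = np.linalg.eig(M_tilde(*c))
    sols = []
    for k in range(eigvecs.shape[1]):
        v = eigvecs[:, k]
        # Coordinates of v correspond to the monomials 1, x1, x2, x1*x2,
        # so v[0] = lambda, v[1] = lambda*a1, v[2] = lambda*a2.
        a1 = np.real(v[1] / v[0])
        a2 = np.real(v[2] / v[0])
        sols.append((float(a1), float(a2)))
    return sorted(sols)

sols = solutions_from_eigenvectors((0.3, 1.0, 2.0))
for a1, a2 in sols:
    print(round(a1, 6), round(a2, 6))
```

Each printed pair should satisfy $f_1 = f_2 = 0$ up to rounding, and the fourth coordinate of each eigenvector gives a consistency check, since it must equal $\lambda a_1 a_2$.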

Exercise 3. For the equations $f_1 = f_2 = 0$ of Exercise 1, consider the
matrix $\tilde{M}'$ coming from $(u_0, u_1, u_2) = (0, 1, 0)$. In the notation of
(6.8), this means $\tilde{M}' = \tilde{M}_1 = (M_{x_1})^T$. Compute the eigenvectors of this
matrix and use Exercise 2 to determine the solutions of $f_1 = f_2 = 0$.

While the left eigenvectors of $\tilde{M}$ relate to the solutions of $f_1 = \cdots =
f_n = 0$, the right eigenvectors give a nice answer to the interpolation
problem. This was worked out in detail in Exercise 17 of Chapter 2, §4, which
applies without change to the case at hand. See Exercise 6 at the end of
this section for an example.

Eigenvalue methods can also be applied to the hidden variable
resultants discussed earlier in this section. We will discuss this very briefly.
In Proposition (5.15), we showed that the $x_n$-coordinates of the solutions
of the equations $f_1 = \cdots = f_n = 0$ could be found using the
resultant $\mathrm{Res}^{x_n}_{d_1,\ldots,d_n}(f_1, \ldots, f_n)$ obtained by regarding $x_n$ as a constant. As we
learned in (5.20),

$$\mathrm{Res}^{x_n}_{d_1,\ldots,d_n}(f_1, \ldots, f_n) = \pm \frac{\hat{D}_0}{\hat{D}_0'},$$

and if $\hat{M}_0$ is the corresponding matrix (so that $\hat{D}_0 = \det(\hat{M}_0)$), one could
ask about the eigenvalues and eigenvectors of $\hat{M}_0$. It turns out that this
is not quite the right question to ask. Rather, since $\hat{M}_0$ depends on the
variable $x_n$, we write the matrix as

(6.9) $$\hat{M}_0 = A_0 + x_n A_1 + \cdots + x_n^l A_l,$$

where each $A_i$ has constant entries and $A_l \ne 0$. Suppose that $\hat{M}_0$ and the
$A_i$ are $m \times m$ matrices. If $A_l$ is invertible, then we can define the generalized
companion matrix

$$C = \begin{pmatrix}
0 & I_m & 0 & \cdots & 0 \\
0 & 0 & I_m & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & I_m \\
-A_l^{-1} A_0 & -A_l^{-1} A_1 & -A_l^{-1} A_2 & \cdots & -A_l^{-1} A_{l-1}
\end{pmatrix},$$

where $I_m$ is the $m \times m$ identity matrix. Then the correct question to pose
concerns the eigenvalues and eigenvectors of $C$. One can show that the
eigenvalues of the generalized companion matrix are precisely the roots of
the polynomial $\hat{D}_0 = \det(\hat{M}_0)$, and the corresponding eigenvectors have a
nice interpretation as well. Further details of this technique can be found
in [Man2] and [Man3].
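
To see the generalized companion matrix in action, the following sketch builds $C$ from a list of coefficient matrices and checks numerically that each eigenvalue $\lambda$ of $C$ makes $A_0 + \lambda A_1 + \cdots + \lambda^l A_l$ singular. The block layout follows the standard companion form reconstructed above, and the small $2 \times 2$ matrices are made-up test data, not taken from any resultant in the text.

```python
import numpy as np

def generalized_companion(A):
    """A = [A_0, ..., A_l], each m x m, with A_l invertible.
    Returns the (l*m) x (l*m) generalized companion matrix."""
    l = len(A) - 1
    m = A[0].shape[0]
    Al_inv = np.linalg.inv(A[l])
    C = np.zeros((l * m, l * m))
    # Identity blocks on the block superdiagonal.
    for i in range(l - 1):
        C[i*m:(i+1)*m, (i+1)*m:(i+2)*m] = np.eye(m)
    # Last block row: -A_l^{-1} A_j for j = 0, ..., l-1.
    for j in range(l):
        C[(l-1)*m:, j*m:(j+1)*m] = -Al_inv @ A[j]
    return C

# Hypothetical test data with l = 2 and m = 2.
A = [np.array([[2.0, 1.0], [0.0, 3.0]]),
     np.array([[0.0, 1.0], [1.0, 0.0]]),
     np.array([[1.0, 0.0], [0.0, 2.0]])]

C = generalized_companion(A)
eigenvalues = np.linalg.eigvals(C)

# Each eigenvalue should be a root of det(A_0 + x A_1 + x^2 A_2).
residuals = [abs(np.linalg.det(A[0] + lam * A[1] + lam**2 * A[2]))
             for lam in eigenvalues]
print(eigenvalues)
print(residuals)
```

The reason this works is that an eigenvector of $C$ has the block form $(w, \lambda w, \ldots, \lambda^{l-1} w)$ with $(A_0 + \lambda A_1 + \cdots + \lambda^l A_l)\, w = 0$.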

Finally, we should say a few words about how eigenvalue and eigenvector
methods behave in the non-generic case. As in the discussion of $u$-resultants
in §5, there are many things which can go wrong. All of the problems listed
earlier are still present when dealing with eigenvalues and eigenvectors, and
there are two new difficulties which can occur:

• In working with the matrix $M_0$ as in the proof of Theorem (6.2), it can
happen that $M_{11}$ is not invertible, so that $\tilde{M} = M_{00} - M_{01} M_{11}^{-1} M_{10}$
doesn't make sense.

• In working with the matrix $\hat{M}_0$ as in (6.9), it can happen that the leading
term $A_l$ is not invertible, so that the generalized companion matrix $C$
doesn't make sense.

Techniques for avoiding both of these problems are described in [Emi2],
[Man1], [Man2] and [Man3].

Exercise 4. Express the $6 \times 6$ matrix of part c of Exercise 5 of §5 in the
form $A_0 + x_3 A_1 + x_3^2 A_2$ and show that $A_2$ is not invertible.

The idea of solving equations by a combination of eigenvalue methods 
and resultants goes back to the work of Auzinger and Stetter [AS]. This has 
now become an active area of research, not only for the resultants discussed 
here but also for the sparse resultants to be introduced in Chapter 7. 



Additional Exercises for §6 

Exercise 5. This exercise will explain how to recover the solution $p =
(a_1, \ldots, a_n)$ from an eigenvector $v$ of the matrix $\tilde{M}'$ in the case when some
of the degrees $d_1, \ldots, d_n$ are equal to 1. For simplicity, we will assume
$d_1 = \cdots = d_k = 1$ and $d_i > 1$ for $i > k$.

a. Show that $S_0'$ has no monomials involving $x_1, \ldots, x_k$. Then explain why
the eigenvector $v$ enables you to determine $a_i$ for $i > k$ but gives no
information about $a_1, \ldots, a_k$.

b. By substituting $x_i = a_i$ for $i > k$ into $f_1 = \cdots = f_n = 0$, show that in
the generic case, we can find the remaining $k$ coordinates of $p$ by solving
a system of $k$ linear equations in $k$ unknowns.

Exercise 6. The equations $f_1 = f_2 = 0$ from Exercise 1 have solutions
$p_1, p_2, p_3, p_4$ (they are listed in projective form in Exercise 2 of §5). Apply
Exercise 17 of Chapter 2, §4, to find the polynomials $g_1, g_2, g_3, g_4$ such that
$g_i(p_j) = 1$ if $i = j$ and 0 otherwise. Then use this to write down explicitly
a polynomial $h$ which takes preassigned values $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ at the points
$p_1, p_2, p_3, p_4$. Hint: Since the $x_1$-coordinates are distinct, it suffices to find
the eigenvectors of $M_{x_1}$. Exercise 1 will be useful.




Chapter 4 

Computation in Local Rings 



Many questions in algebraic geometry involve a study of local properties of
varieties, that is, properties of a single point, or of a suitably small
neighborhood of a point. For example, in analyzing $\mathbf{V}(I)$ for a zero-dimensional
ideal $I \subset k[x_1, \ldots, x_n]$, even when $k$ is algebraically closed, it
sometimes happens that $\mathbf{V}(I)$ contains fewer distinct points than the dimension
$d = \dim k[x_1, \ldots, x_n]/I$. In this situation, thinking back to the
consequences of unique factorization for polynomials in one variable, it is natural
to ask whether there is an algebraic multiplicity that can be computed
locally at each point in $\mathbf{V}(I)$, with the property that the sum of the
multiplicities is equal to $d$. Similarly, in the study of singularities of varieties,
the major objects of study are local invariants of singular points. These
are used to distinguish different types of singularities and study their local
structure. In §1 of this chapter, we will introduce the algebra of local rings,
which is useful for both of these types of questions. Multiplicities and some
first invariants of singularities will be introduced in §2. In §3 and §4, we
will develop algorithmic techniques for computation in local rings parallel
to the theory of Gröbner bases in polynomial rings.

In this chapter, we will often assume that k is an algebraically closed 
field containing Q. The results of Chapters 2 and 3 are valid for such fields. 



§1 Local Rings 

One way to study properties of a variety $V$ is to study functions on the
variety. The elements of the ring $k[x_1, \ldots, x_n]/\mathbf{I}(V)$ can be thought of as the
polynomial functions on $V$. Near a particular point $p \in V$ we can also
consider rational functions defined at the point, power series convergent at the
point, or even formal series centered at the point. Considering the
collections of each of these types of functions in turn leads us to new rings which
strictly contain the ring of polynomials. In a sense which we shall make
precise as we go along, consideration of these larger rings corresponds to
looking at smaller neighborhoods of points. We will begin with the following
example. Let $V = k^n$ and let $p = (0, \ldots, 0)$ be the origin. The single
point set $\{p\}$ is a variety, and $\mathbf{I}(\{p\}) = \langle x_1, \ldots, x_n \rangle \subset k[x_1, \ldots, x_n]$.
Furthermore, a rational function $f/g$ has a well-defined value at $p$ provided
$g(p) \neq 0$.

(1.1) Definition. We denote by $k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$ the collection of all
rational functions $f/g$ of $x_1, \ldots, x_n$ with $g(p) \neq 0$, where $p = (0, \ldots, 0)$.

The main properties of $k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$ are as follows.

(1.2) Proposition. Let $R = k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$. Then

a. $R$ is a subring of the field of rational functions $k(x_1, \ldots, x_n)$ containing
$k[x_1, \ldots, x_n]$.

b. Let $M = \langle x_1, \ldots, x_n \rangle \subset R$ (the ideal generated by $x_1, \ldots, x_n$ in $R$).
Then every element in $R \setminus M$ is a unit in $R$ (that is, has a multiplicative
inverse in $R$).

c. $M$ is a maximal ideal in $R$, and $R$ has no other maximal ideals.

Proof. As above, let $p = (0, \ldots, 0)$. Part a follows easily since $R$ is closed
under sums and products in $k(x_1, \ldots, x_n)$. For instance, if $f_1/g_1$ and $f_2/g_2$
are two rational functions with $g_1(p), g_2(p) \neq 0$, then

\[ f_1/g_1 + f_2/g_2 = (f_1 g_2 + f_2 g_1)/(g_1 g_2). \]

Since $g_1(p) \neq 0$ and $g_2(p) \neq 0$, $g_1(p) \cdot g_2(p) \neq 0$. Hence the sum is an
element of $R$. A similar argument shows that the product $(f_1/g_1) \cdot (f_2/g_2)$
is in $R$. Finally, since $f = f/1$ is in $R$ for all $f \in k[x_1, \ldots, x_n]$, the
polynomial ring is contained in $R$.

For part b, we will use the fact that the elements in $M = \langle x_1, \ldots, x_n \rangle$
are exactly the rational functions $f/g \in R$ such that $f(p) = 0$. Hence if
$f/g \notin M$, then $f(p) \neq 0$ and $g(p) \neq 0$, and $g/f$ is a multiplicative inverse
for $f/g$ in $R$.

Finally, for part c, if $N \neq M$ is an ideal in $R$ with $M \subset N \subset R$, then
$N$ must contain an element $f/g$ in the complement of $M$. By part b, $f/g$
is a unit in $R$, so $1 = (f/g)(g/f) \in N$, and hence $N = R$. Therefore $M$
is maximal. $M$ is the only maximal ideal in $R$, because it also follows from
part b that every proper ideal $I \subset R$ is contained in $M$. □

Exercise 1. In this exercise you will show that if $p = (a_1, \ldots, a_n) \in k^n$
is any point and

\[ R = \{ f/g : f, g \in k[x_1, \ldots, x_n],\ g(p) \neq 0 \}, \]

then we have the following statements parallel to Proposition (1.2).

a. $R$ is a subring of the field of rational functions $k(x_1, \ldots, x_n)$.

b. Let $M$ be the ideal generated by $x_1 - a_1, \ldots, x_n - a_n$ in $R$. Then every
element in $R \setminus M$ is a unit in $R$ (i.e., has a multiplicative inverse in $R$).

c. $M$ is a maximal ideal in $R$, and $R$ has no other maximal ideals.

An alternative notation for the ring $R$ in Exercise 1 is

\[ R = k[x_1, \ldots, x_n]_{\langle x_1 - a_1, \ldots, x_n - a_n \rangle}, \]

where $\langle x_1 - a_1, \ldots, x_n - a_n \rangle$ is the ideal $\mathbf{I}(\{p\})$ in $k[x_1, \ldots, x_n]$, and in $R$
we allow denominators that are not elements of this ideal.

In the following discussion, the term ring will always mean a commuta- 
tive ring with identity. Every ring has maximal ideals. As we will see, the 
ones that give local information are the ones with the property given by 
part c of Proposition (1.2) above. 

(1.3) Definition. A local ring is a ring that has exactly one maximal 
ideal. 

The idea of the argument used in the proof of part c of the proposition 
also gives one general criterion for a ring to be a local ring. 

(1.4) Proposition. A ring $R$ with a proper ideal $M \subset R$ is a local ring
if every element of $R \setminus M$ is a unit in $R$.

Proof. If every element of $R \setminus M$ is a unit in $R$, the unique maximal
ideal is $M$. Exercise 5 below asks you to finish the proof. □

Definition (1.1) above is actually a special case of a general procedure 
called localization that can be used to construct many additional examples 
of local rings. See Exercise 8 below. An even more general construction 
of rings of fractions is given in Exercise 9. We will need to use that 
construction in §3 and §4. 

We also obtain important examples of local rings by considering functions 
more general than rational functions. One way such functions arise is as 
follows. When studying a curve or, more generally, a variety near a point, 
one often tries to parametrize the variety near the point. For example, the 
curve 



\[ x^2 + 2x + y^2 = 0 \]

is a circle of radius 1 centered at the point $(-1, 0)$. To study this curve
near the origin, we might use parametrizations of several different types. 

Exercise 2. Show that one parametrization of the circle near the origin
is given by

\[ x = \frac{-2t^2}{1 + t^2}, \qquad y = \frac{2t}{1 + t^2}. \]

Note that both components are elements of the local ring $k[t]_{\langle t \rangle}$.

In this case, we might also use the parametrization in terms of
trigonometric functions:

\[ x = -1 + \cos t, \qquad y = \sin t. \]

The functions $\sin t$ and $\cos t$ are not polynomials or rational functions, but
recall from elementary calculus that they can be expressed as convergent
power series in $t$:

\[ \sin t = \sum_{k=0}^{\infty} (-1)^k t^{2k+1}/(2k+1)!, \]

\[ \cos t = \sum_{k=0}^{\infty} (-1)^k t^{2k}/(2k)!. \]

In this case parametrizing leads us to consider functions more general than
polynomials or rational functions.
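
Both parametrizations of the circle $x^2 + 2x + y^2 = 0$ can be sanity-checked by machine. The following is only an illustrative sketch, assuming $k = \mathbb{Q}$: it tests the rational parametrization exactly at a handful of sample parameters, and tests the trigonometric one on power series truncated modulo $t^N$. The names `rational_point`, `mul_trunc` and the choice $N = 12$ are ours, not the text's, and a check at sample values is of course not a proof.

```python
from fractions import Fraction
from math import factorial

def circle(x, y):
    """Left-hand side of the circle equation x^2 + 2x + y^2 = 0."""
    return x * x + 2 * x + y * y

def rational_point(t):
    """Point (x, y) given by x = -2t^2/(1+t^2), y = 2t/(1+t^2)."""
    t = Fraction(t)
    return -2 * t * t / (1 + t * t), 2 * t / (1 + t * t)

# Exact check of the rational parametrization at sample rational parameters
rational_ok = all(circle(*rational_point(Fraction(n, 7))) == 0
                  for n in range(-14, 15))

# Truncated Taylor coefficients of sin t and cos t, modulo t^N
N = 12
sin_series = [Fraction(0)] * N
cos_series = [Fraction(0)] * N
for k in range(N):
    if 2 * k + 1 < N:
        sin_series[2 * k + 1] = Fraction((-1) ** k, factorial(2 * k + 1))
    if 2 * k < N:
        cos_series[2 * k] = Fraction((-1) ** k, factorial(2 * k))

def mul_trunc(a, b, n):
    """Product of two coefficient lists, truncated modulo t^n."""
    out = [Fraction(0)] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b[:n - i]):
            out[i + j] += ai * bj
    return out

x = list(cos_series)
x[0] -= 1                      # x = -1 + cos t
y = sin_series                 # y = sin t
lhs = [a + 2 * b + c
       for a, b, c in zip(mul_trunc(x, x, N), x, mul_trunc(y, y, N))]
series_ok = all(c == 0 for c in lhs)   # x^2 + 2x + y^2 vanishes modulo t^N
```

The second check works entirely with coefficients, which is the point of view taken for formal power series later in this section.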

If $k = \mathbb{C}$ or $k = \mathbb{R}$, then we can consider the set of convergent power
series in $n$ variables (expanding about the origin)

(1.5)
\[ k\{x_1, \ldots, x_n\} = \Big\{ \sum_{\alpha \in \mathbb{Z}^n_{\geq 0}} c_\alpha x^\alpha : c_\alpha \in k \text{ and the series converges in some neighborhood of } 0 \in k^n \Big\}. \]

With the usual notion of addition and multiplication, this set is a ring (we
leave the verification to the reader; see Exercise 3). In fact, it is not difficult
to see that $k\{x_1, \ldots, x_n\}$ is also a local ring with maximal ideal generated
by $x_1, \ldots, x_n$.

No matter what field $k$ is, we can also consider the set $k[[x_1, \ldots, x_n]]$ of
formal power series

(1.6)
\[ k[[x_1, \ldots, x_n]] = \Big\{ \sum_{\alpha \in \mathbb{Z}^n_{\geq 0}} c_\alpha x^\alpha : c_\alpha \in k \Big\}, \]

where, now, we waive the condition that the series need converge.
Algebraically, a formal power series is a perfectly well defined object and can
easily be manipulated; one must, however, give up the notion of evaluating
it at any point of $k^n$ other than the origin. As a result, a formal power series
defines a function only in a rather limited sense. But in any case we can
define addition and multiplication of formal series in the obvious way and
this makes $k[[x_1, \ldots, x_n]]$ into a ring (see Exercise 3). Formal power series
are also useful in constructing parametrizations of varieties over arbitrary
fields (see Exercise 7 below).

We can now make our comment that successive consideration of
$k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$, then $k\{x_1, \ldots, x_n\}$, then $k[[x_1, \ldots, x_n]]$ corresponds
to looking at smaller neighborhoods of points somewhat more precise. An
element $f/g \in k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$ is defined not just at the origin but
at every point in the complement of $\mathbf{V}(g)$. The domain of convergence of
a power series can be a much smaller set than the complement of a
variety. For instance, the geometric series $1 + x + x^2 + \cdots$ converges to the
sum $1/(1 - x) \in k[x]_{\langle x \rangle}$ only on the set of $x$ with $|x| < 1$ in $k = \mathbb{R}$ or
$\mathbb{C}$. A formal series in $k[[x_1, \ldots, x_n]]$ is only guaranteed to converge at the
origin. Nevertheless, both $k\{x_1, \ldots, x_n\}$ and $k[[x_1, \ldots, x_n]]$ share the key
algebraic property of $k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$.

(1.7) Proposition. $k[[x_1, \ldots, x_n]]$ is a local ring. If $k = \mathbb{R}$ or $k = \mathbb{C}$
then $k\{x_1, \ldots, x_n\}$ is also a local ring.

Proof. To show that $k[[x_1, \ldots, x_n]]$ is a local ring, consider the ideal
$M = \langle x_1, \ldots, x_n \rangle \subset k[[x_1, \ldots, x_n]]$ generated by $x_1, \ldots, x_n$. If $f \notin M$,
then $f = c_0 + g$ with $c_0 \neq 0$, and $g \in M$. Using the formal geometric series
expansion

\[ \frac{1}{1 + t} = 1 - t + t^2 - \cdots + (-1)^n t^n + \cdots, \]

we see that

\[ \frac{1}{c_0 + g} = \frac{1}{c_0 (1 + g/c_0)} = (1/c_0)\big(1 - g/c_0 + (g/c_0)^2 - \cdots\big). \]

In Exercise 4 below, you will show that this expansion makes sense as
an element of $k[[x_1, \ldots, x_n]]$. Hence $f$ has a multiplicative inverse in
$k[[x_1, \ldots, x_n]]$. Since this is true for every $f \notin M$, Proposition (1.4) implies
that $k[[x_1, \ldots, x_n]]$ is a local ring.

To show that $k\{x_1, \ldots, x_n\}$ is also a local ring, we only need to show
that the formal series expansion for $1/(c_0 + g)$ gives a convergent series.
See Exercise 4. □
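
The inversion used in the proof can be carried out coefficient by coefficient. Below is a minimal sketch in one variable over $\mathbb{Q}$, working modulo $x^n$; the helper names `mul_trunc` and `inverse_trunc` and the sample series are illustrative assumptions, not notation from the text. It relies on the fact that $(g/c_0)^k$ has no terms of degree below $k$, so only finitely many powers contribute to each coefficient.

```python
from fractions import Fraction

def mul_trunc(a, b, n):
    """Product of two series given by coefficient lists, truncated modulo x^n."""
    out = [Fraction(0)] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[:n - i]):
            out[i + j] += ai * bj
    return out

def inverse_trunc(f, n):
    """Invert f = c0 + g (c0 != 0, g in <x>) modulo x^n using
    1/f = (1/c0) * (1 - g/c0 + (g/c0)^2 - ...)."""
    c0 = Fraction(f[0])
    if c0 == 0:
        raise ValueError("f lies in the maximal ideal, so it is not a unit")
    t = [Fraction(0)] + [Fraction(c) / c0 for c in f[1:n]]   # t = g/c0
    t += [Fraction(0)] * (n - len(t))
    total = [Fraction(0)] * n
    power = [Fraction(1)] + [Fraction(0)] * (n - 1)          # t^0
    sign = 1
    for _ in range(n):            # t^k vanishes modulo x^n once k >= n
        total = [s + sign * p for s, p in zip(total, power)]
        power = mul_trunc(power, t, n)
        sign = -sign
    return [c / c0 for c in total]

f = [2, 1, 3]                                  # f = 2 + x + 3x^2
inv = inverse_trunc(f, 6)
check = mul_trunc([Fraction(c) for c in f], inv, 6)   # should be 1 modulo x^6
geometric = inverse_trunc([1, -1], 4)          # 1/(1 - x) = 1 + x + x^2 + ...
```

A series with zero constant term is rejected, which mirrors the statement that the elements of $M$ are exactly the non-units.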

All three types of local rings share other key algebraic properties with
rings of polynomials. See the exercises in §4. By considering the power
series expansion of a rational function defined at the origin, as in the proof
above, we have $k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle} \subset k[[x_1, \ldots, x_n]]$. In the case $k = \mathbb{R}$
or $\mathbb{C}$, we also have inclusions:

\[ k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle} \subset k\{x_1, \ldots, x_n\} \subset k[[x_1, \ldots, x_n]]. \]

In general, we would like to be able to do operations on ideals in these 
rings in much the same way that we can carry out operations on ideals in 
a polynomial ring. For instance, we would like to be able to settle the ideal 
membership question, to form intersections of ideals, compute quotients, 
compute syzygies on a collection of elements, and the like. We will return 
to these questions in §3 and §4. 

Additional Exercises for §1 

Exercise 3. The product operations in $k[[x_1, \ldots, x_n]]$ and $k\{x_1, \ldots, x_n\}$
can be described in the following fashion. Grouping terms by total degree,
rewrite each power series

\[ f(x) = \sum_{\alpha \in \mathbb{Z}^n_{\geq 0}} c_\alpha x^\alpha \]

as $\sum_{n \geq 0} f_n(x)$, where

\[ f_n(x) = \sum_{|\alpha| = n} c_\alpha x^\alpha \]

is a homogeneous polynomial of degree $n$. The product of two series $f(x)$
and $g(x)$ is the series $h(x)$ for which

\[ h_n = f_n g_0 + f_{n-1} g_1 + \cdots + f_0 g_n. \]

a. Show that with this product and the obvious sum, $k[[x_1, \ldots, x_n]]$ is a
(commutative) ring (with identity).

b. Now assume $k = \mathbb{R}$ or $k = \mathbb{C}$, and suppose $f, g \in k\{x_1, \ldots, x_n\}$. From
part a, we know that sums and products of power series give other formal
series. Show that if $f$ and $g$ are both convergent on some neighborhood
$U$ of $(0, \ldots, 0)$, then $f + g$ and $f \cdot g$ are also convergent on $U$.



Exercise 4. Let $h \in \langle x_1, \ldots, x_n \rangle \subset k[[x_1, \ldots, x_n]]$.

a. Show that the formal geometric series expansion

\[ \frac{1}{1 + h} = 1 - h + h^2 - h^3 + \cdots \]

gives a well-defined element of $k[[x_1, \ldots, x_n]]$. (What are the homogeneous
components of the series on the right?)

b. Show that if $h$ is convergent on some neighborhood of the origin, then
the expansion in part a is also convergent on some (generally smaller)
neighborhood of the origin. (Recall that

\[ \frac{1}{1 + t} = 1 - t + t^2 - t^3 + \cdots \]

is convergent only for $t$ satisfying $|t| < 1$.)



Exercise 5. Give a complete proof for Proposition (1.4). 



Exercise 6. Let $F$ be a field. A discrete valuation of $F$ is an onto mapping
$v : F \setminus \{0\} \to \mathbb{Z}$ with the properties that

1. $v(x + y) \geq \min\{v(x), v(y)\}$, and

2. $v(xy) = v(x) + v(y)$.

The subset of $F$ consisting of all elements $x$ satisfying $v(x) \geq 0$, together
with 0, is called the valuation ring of $v$.

a. Show that the valuation ring of a discrete valuation is a local ring. Hint:
Use Proposition (1.4).

b. For example, let $F = k(x)$ (the rational function field in one variable),
and let $f$ be an irreducible polynomial in $k[x] \subset F$. If $g \in k(x)$ is nonzero, then
by unique factorization in $k[x]$, there is a unique expression for $g$ of the
form $g = f^a \cdot n/d$, where $a \in \mathbb{Z}$, and $n, d \in k[x]$ are not divisible by
$f$. Let $v(g) = a \in \mathbb{Z}$. Show that $v$ defines a discrete valuation on $k(x)$.
Identify the maximal ideal of the valuation ring.

c. Let $F = \mathbb{Q}$, and let $p$ be a prime integer. Show that if $g \in \mathbb{Q}$ is nonzero, then by
unique factorization in $\mathbb{Z}$, there is a unique expression for $g$ of the form
$g = p^a \cdot n/d$, where $a \in \mathbb{Z}$, and $n, d \in \mathbb{Z}$ are not divisible by $p$. Let
$v(g) = a \in \mathbb{Z}$. Show that $v$ defines a discrete valuation on $\mathbb{Q}$. Identify
the maximal ideal of this valuation ring.
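
The valuation of part c is easy to experiment with. The sketch below (the function name `valuation` and the sample range are our own choices) computes the exponent $a$ in $g = p^a \cdot n/d$ and spot-checks the two defining properties on a finite set of rationals. This is numerical evidence only and does not replace the proof the exercise asks for.

```python
from fractions import Fraction

def valuation(g, p):
    """Exponent a in g = p^a * n/d with n, d integers not divisible by p."""
    g = Fraction(g)
    if g == 0:
        raise ValueError("the valuation is defined only on nonzero elements")
    a = 0
    n, d = g.numerator, g.denominator
    while n % p == 0:          # strip factors of p from the numerator
        n //= p
        a += 1
    while d % p == 0:          # strip factors of p from the denominator
        d //= p
        a -= 1
    return a

p = 3
samples = [Fraction(n, d) for n in range(-9, 10) for d in range(1, 10) if n != 0]

# Property 2: v(xy) = v(x) + v(y)
multiplicative = all(valuation(x * y, p) == valuation(x, p) + valuation(y, p)
                     for x in samples for y in samples)

# Property 1: v(x + y) >= min(v(x), v(y)), tested whenever x + y != 0
ultrametric = all(valuation(x + y, p) >= min(valuation(x, p), valuation(y, p))
                  for x in samples for y in samples if x + y != 0)
```

In this picture the valuation ring consists of the fractions whose reduced denominator is not divisible by $p$.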

Exercise 7. (A Formal Implicit Function Theorem) Let $f(x, y) \in k[x, y]$
be a polynomial of the form

\[ f(x, y) = y^n + c_1(x) y^{n-1} + \cdots + c_{n-1}(x) y + c_n(x), \]

where $c_i(x) \in k[x]$. Assume that $f(0, y) = 0$ has $n$ distinct roots $a_i \in k$.

a. Starting from $y_i^{(0)}(x) = a_i$, show that there is a unique $a_{i1} \in k$ such
that $y_i^{(1)}(x) = a_i + a_{i1} x$ satisfies

\[ f(x, y_i^{(1)}(x)) \equiv 0 \bmod \langle x^2 \rangle. \]

b. Show that if we have a polynomial $y_i^{(\ell)}(x) = a_i + a_{i1} x + \cdots + a_{i\ell} x^\ell$
that satisfies

\[ f(x, y_i^{(\ell)}(x)) \equiv 0 \bmod \langle x^{\ell+1} \rangle, \]

then there exists a unique $a_{i,\ell+1} \in k$ such that

\[ y_i^{(\ell+1)}(x) = y_i^{(\ell)}(x) + a_{i,\ell+1} x^{\ell+1} \]

satisfies

\[ f(x, y_i^{(\ell+1)}(x)) \equiv 0 \bmod \langle x^{\ell+2} \rangle. \]

c. From parts a and b, deduce that there is a unique power series $y_i(x) \in
k[[x]]$ that satisfies $f(x, y_i(x)) = 0$ and $y_i(0) = a_i$.

Geometrically, this gives a formal series parametrization of the branch of
the curve $f(x, y) = 0$ passing through $(0, a_i)$: $(x, y_i(x))$. It also follows that
$f(x, y)$ factors in the ring $k[[x]][y]$:

\[ f(x, y) = \prod_{i=1}^{n} (y - y_i(x)). \]
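
The lifting procedure of parts a and b can be tried out on a small example. The sketch below assumes $k = \mathbb{Q}$ and uses the sample polynomial $f(x, y) = y^2 - (1 + x)$, which is our choice and does not come from the text. The update rule it uses, taking the next coefficient to be $-r/\frac{\partial f}{\partial y}(0, a_i)$ where $r$ is the coefficient of $x^{\ell+1}$ in $f(x, y_i^{(\ell)}(x))$, is the standard Newton-style step; treat it as an assumption of this illustration rather than as the intended solution of the exercise. All function names are hypothetical.

```python
from fractions import Fraction

def mul_trunc(a, b, n):
    """Product of two coefficient lists in x, truncated modulo x^n."""
    out = [Fraction(0)] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[:n - i]):
            out[i + j] += ai * bj
    return out

def eval_trunc(coeffs, y, n):
    """Evaluate f(x, y(x)) modulo x^n by Horner's rule in y.
    coeffs[j] is the coefficient list (in x) of y^j in f."""
    result = [Fraction(0)] * n
    for c in reversed(coeffs):
        result = mul_trunc(result, y, n)
        for i, ci in enumerate(c[:n]):
            result[i] += ci
    return result

def lift_root(coeffs, a0, order):
    """Lift a simple root a0 of f(0, y) to a polynomial y(x) of degree <= order
    with f(x, y(x)) = 0 modulo x^(order + 1)."""
    a0 = Fraction(a0)
    # Partial derivative of f with respect to y, evaluated at (0, a0)
    dfdy = sum(j * Fraction(c[0]) * a0 ** (j - 1)
               for j, c in enumerate(coeffs) if j > 0 and c)
    if dfdy == 0:
        raise ValueError("a0 is not a simple root of f(0, y)")
    y = [a0]
    for l in range(order):
        n = l + 2                                   # work modulo x^(l + 2)
        residual = eval_trunc(coeffs, y + [Fraction(0)], n)[l + 1]
        y.append(-residual / dfdy)                  # the unique next coefficient
    return y

# f(x, y) = y^2 - (1 + x); f(0, y) = y^2 - 1 has the distinct roots 1 and -1
f = [[-1, -1], [0], [1]]
branch = lift_root(f, 1, 4)
residual_check = eval_trunc(f, branch, 5)
```

For the root $a = 1$ the computed coefficients agree with the familiar binomial series of $\sqrt{1 + x}$, and starting from $a = -1$ gives the other branch.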

Exercise 8. Let $R$ be an integral domain (that is, a ring with no
zero-divisors), and let $P \subset R$ be a prime ideal (see Exercise 8 of Chapter 1, §1
for the definition, which is the same in any ring $R$). The localization of $R$
with respect to $P$, denoted $R_P$, is a new ring containing $R$, in which every
element in $R$ not in the specified prime ideal $P$ becomes a unit. We define

\[ R_P = \{ r/s : r, s \in R,\ s \notin P \}, \]

so that $R_P$ is a subset of the field of fractions of $R$.

a. Using Proposition (1.4), show that $R_P$ is a local ring, with maximal
ideal $M = \{ p/s : p \in P,\ s \notin P \}$.

b. Show that every ideal in $R_P$ has the form $I_P = \{ a/s : a \in I,\ s \notin P \}$,
where $I$ is an ideal of $R$ contained in $P$.

Exercise 9. The construction of $R_P$ in Exercise 8 can be generalized in
the following way. If $R$ is any ring, and $S \subset R$ is a set which is closed under
multiplication (that is, $s_1, s_2 \in S$ implies $s_1 \cdot s_2 \in S$), then we can form
“fractions” $a/s$, with $a \in R$, $s \in S$. We will say two fractions $a/s$ and $b/t$
are equivalent if there is some $u \in S$ such that $u(at - bs) = 0$ in $R$. We
call the collection of equivalence classes for this relation $S^{-1}R$.

a. Show that forming sums and products as with ordinary fractions gives
well-defined operations on $S^{-1}R$.

b. Show that $S^{-1}R$ is a ring under these sum and product operations.

c. If $R$ is any ring (not necessarily an integral domain) and $P \subset R$ is a
prime ideal, show that $S = R \setminus P$ is closed under multiplication. The
resulting ring of fractions $S^{-1}R$ is also denoted $R_P$ (as in Exercise 8).
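
The role of the extra factor $u$ in the equivalence relation is easiest to see in a ring with zero-divisors. The following sketch uses the example $R = \mathbb{Z}/6\mathbb{Z}$ and $S = \{1, 2, 4\}$, which is our own illustration and not taken from the text, and enumerates the equivalence classes of fractions by brute force. The grouping loop compares each fraction with one representative per class, which is legitimate only because the relation is an equivalence relation.

```python
from itertools import product

N = 6                      # R = Z/6Z, which has zero-divisors
S = [1, 2, 4]              # a subset closed under multiplication modulo 6

def equivalent(frac1, frac2):
    """a/s ~ b/t iff u*(a*t - b*s) = 0 in R for some u in S."""
    (a, s), (b, t) = frac1, frac2
    return any((u * (a * t - b * s)) % N == 0 for u in S)

all_fractions = list(product(range(N), S))      # pairs (a, s) standing for a/s

# Group the fractions into equivalence classes
classes = []
for frac in all_fractions:
    for cls in classes:
        if equivalent(frac, cls[0]):
            cls.append(frac)
            break
    else:
        classes.append([frac])

closed = all((s * t) % N in S for s in S for t in S)
three_is_zero = equivalent((3, 1), (0, 1))      # uses u = 2, since 2 * 3 = 0 in R
```

Here 18 formal fractions collapse to 3 classes, and the fraction $3/1$ becomes equal to $0/1$ even though $3 \neq 0$ in $R$. Without the factor $u$ the relation would miss this identification.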

Exercise 10. Let $R = k[x_1, \ldots, x_n]$ and $I = \langle f_1, \ldots, f_m \rangle$ be an ideal in
$R$. Let $M = \langle x_1, \ldots, x_n \rangle$ be the maximal ideal of polynomials vanishing
at the origin and suppose that $I \subset M$.

a. Show that the ideal $M/I$ generated by the cosets of $x_1, \ldots, x_n$ in $R/I$
is a prime ideal.

b. Let $IR_M$ denote the ideal generated by the $f_i$ in the ring $R_M$, and
let $(R/I)_{M/I}$ be constructed as in Exercise 8. Let $r/s \in R_M$, let $[r], [s]$
denote the cosets of the numerator and denominator in $R/I$, and let $[r/s]$
denote the coset of the fraction in $R_M/IR_M$. Show that the mapping

\[ \varphi : R_M/IR_M \to (R/I)_{M/I}, \qquad [r/s] \mapsto [r]/[s] \]

is well-defined and gives an isomorphism of rings.

Exercise 11. Let $R = k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$. Show that every ideal $I \subset
R$ has a generating set consisting of polynomials $f_1, \ldots, f_s \in k[x_1, \ldots, x_n]$.

Exercise 12. (Another interpretation of $k\{x_1, \ldots, x_n\}$) If $k = \mathbb{R}$
(respectively $\mathbb{C}$) then each element of $k\{x_1, \ldots, x_n\}$ converges in some
neighborhood of the origin in $\mathbb{R}^n$ (respectively $\mathbb{C}^n$), hence defines an
analytic function on that neighborhood. We say that two analytic functions,
each defined on some neighborhood of the origin, are equivalent if there is
some (smaller) neighborhood of the origin on which they are equal. An
equivalence class of analytic functions with respect to this relation is called a
germ of an analytic function (at the origin).

a. Show that the set of germs of analytic functions at the origin is a ring
under the usual sum and product of functions.

b. Show that this ring can be identified with $k\{x_1, \ldots, x_n\}$ and that the
maximal ideal is precisely the set of germs of analytic functions which
vanish at the origin.

c. Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by

\[ f(x) = \begin{cases} e^{-1/x^2} & \text{if } x > 0 \\ 0 & \text{if } x \leq 0. \end{cases} \]

Show that $f$ is $C^\infty$ on $\mathbb{R}$, and construct its Taylor series, expanding
at $a = 0$. Does the Taylor series converge to $f(x)$ for all $x$ in some
neighborhood of $0 \in \mathbb{R}$?

If $k = \mathbb{R}$, the example given in part c shows that the ring of germs of
infinitely differentiable real functions is not equal to $k\{x_1, \ldots, x_n\}$. On
the other hand, it is a basic theorem of complex analysis that a complex
differentiable function is analytic.



§2 Multiplicities and Milnor Numbers 

In this section we will see how local rings can be used to assign local
multiplicities at the points in $\mathbf{V}(I)$ for a zero-dimensional ideal $I$. We will
also use local rings to define the Milnor and Tjurina numbers of an isolated
singular point of a hypersurface.

To see what the issues are, let us turn to one of the most frequent
computations that one is called to do in a local ring, that of computing the
dimension of the quotient ring by a zero-dimensional ideal. In Chapter 2, we
learned how to compute the dimension of $k[x_1, \ldots, x_n]/I$ when $I$ is a
zero-dimensional polynomial ideal. Recall how this works. For any monomial
order, we have

\[ \dim k[x_1, \ldots, x_n]/I = \dim k[x_1, \ldots, x_n]/\langle \mathrm{LT}(I) \rangle, \]

and the latter is just the number of monomials $x^\alpha$ such that $x^\alpha \notin \langle \mathrm{LT}(I) \rangle$.
For example, if

\[ I = \langle x^2 + x^3, y^2 \rangle \subset k[x, y], \]

then using the lex order with $y > x$ for instance, the given generators form
a Gröbner basis for $I$. So

\[ \dim k[x, y]/I = \dim k[x, y]/\langle \mathrm{LT}(I) \rangle = \dim k[x, y]/\langle x^3, y^2 \rangle = 6. \]

The rightmost equality follows because the cosets of $1, x, x^2, y, xy, x^2 y$ form
a vector space basis of $k[x, y]/\langle x^3, y^2 \rangle$. The results of Chapter 2 show that
there are at most 6 common zeros of $x^2 + x^3$ and $y^2$ in $k^2$. In fact, from the
simple form of the generators of $I$ we see there are precisely two distinct
points in $\mathbf{V}(I)$: $(-1, 0)$ and $(0, 0)$.
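
The count of monomials outside $\langle \mathrm{LT}(I) \rangle$ is a finite enumeration once the leading monomials are known. Here is a small sketch for the two-variable example above. The function name `standard_monomials` and the search bound are our own assumptions; the bound only needs to exceed the exponents of the pure powers $x^3$ and $y^2$ in the leading term ideal.

```python
from itertools import product

def standard_monomials(leading_exponents, bound):
    """Exponent vectors (a, b), with a, b < bound, of the monomials x^a y^b
    that are divisible by none of the monomials with the listed exponents."""
    def divisible(m, lt):
        return all(mi >= li for mi, li in zip(m, lt))
    return [m for m in product(range(bound), repeat=2)
            if not any(divisible(m, lt) for lt in leading_exponents)]

# <LT(I)> = <x^3, y^2> for I = <x^2 + x^3, y^2> (lex order with y > x)
basis = standard_monomials([(3, 0), (0, 2)], bound=10)
dimension = len(basis)
```

The exponent vectors found correspond to the basis $1, x, x^2, y, xy, x^2 y$ quoted in the text.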

To define the local multiplicity of a solution of a system of equations,
we use a local ring instead of the polynomial ring, but the idea is much
the same as above. We will need the following notation. If $I$ is an ideal
in $k[x_1, \ldots, x_n]$, then we sometimes denote by $I k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$ the
ideal generated by $I$ in the larger ring $k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$.

(2.1) Definition. Let $I$ be a zero-dimensional ideal in $k[x_1, \ldots, x_n]$, so
that $\mathbf{V}(I)$ consists of finitely many points in $k^n$, and assume that $(0, 0, \ldots, 0)$
is one of them. Then the multiplicity of $(0, 0, \ldots, 0)$ as a point in $\mathbf{V}(I)$ is

\[ \dim_k k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle} / I k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}. \]

More generally, if $p = (a_1, \ldots, a_n) \in \mathbf{V}(I)$, then the multiplicity of $p$,
denoted $m(p)$, is the dimension of the ring obtained by localizing $k[x_1, \ldots, x_n]$
at the maximal ideal $M = \mathbf{I}(\{p\}) = \langle x_1 - a_1, \ldots, x_n - a_n \rangle$ corresponding
to $p$, and taking the quotient:

\[ \dim k[x_1, \ldots, x_n]_M / I k[x_1, \ldots, x_n]_M. \]

Since $k[x_1, \ldots, x_n]_M$ is a local ring, it is easy to show that the
quotient $k[x_1, \ldots, x_n]_M / I k[x_1, \ldots, x_n]_M$ is also local (see Exercise 6 below).
The intuition is that since $M$ is the maximal ideal of $p \in \mathbf{V}(I)$, the ring
$k[x_1, \ldots, x_n]_M / I k[x_1, \ldots, x_n]_M$ should reflect the local behavior of $I$ at
$p$. Hence the multiplicity $m(p)$, which is the dimension of this ring, is a
measure of how complicated $I$ is at $p$.

We can also define the multiplicity of a solution $p$ of a specific system
$f_1 = \cdots = f_n = 0$ of $n$ equations in $n$ unknowns, provided that $p$ is an
isolated solution (that is, there exists a neighborhood of $p$ in which the
system has no other solutions). From a more sophisticated point of view,
this multiplicity is sometimes called the local intersection multiplicity of
the varieties $\mathbf{V}(f_i)$ at $p$.

Let us check Definition (2.1) in our example. Let $R = k[x, y]_{\langle x, y \rangle}$ be
the local ring of $k^2$ at $(0, 0)$ and consider the ideal $J$ generated by the
polynomials $x^2 + x^3$ and $y^2$ in $R$. The multiplicity of their common zero
$(0, 0)$ is $\dim R/J$.

Exercise 1. Notice that $x^2 + x^3 = x^2(1 + x)$.

a. Show that $1 + x$ is a unit in $R$, so $1/(1 + x) \in R$.

b. Show that $x^2$ and $y^2$ generate the same ideal in $R$ as $x^2 + x^3$ and $y^2$.

c. Show that every element $f \in R$ can be written uniquely as $f = g/(1 +
h)$, where $g \in k[x, y]$ and $h \in \langle x, y \rangle \subset k[x, y]$.

d. Show that for each $f \in R$, the coset $[f] \in R/\langle x^2, y^2 \rangle R$ is equal to the
coset $[g(1 - h + h^2)]$, where $g, h$ are as in part c.

e. Deduce that every coset in $R/\langle x^2, y^2 \rangle R$ can be written as $[a + bx + cy +
dxy]$ for some unique $a, b, c, d \in k$.

By the result of Exercise 1,

\[ \dim R/J = \dim R/\langle x^2, y^2 \rangle R = 4. \]

Thus the multiplicity of $(0, 0)$ as a solution of $x^2 + x^3 = y^2 = 0$ is 4.

Similarly, let us compute the multiplicity of $(-1, 0)$ as a solution of this
system. Rather than localizing at the prime ideal $\langle x + 1, y \rangle$, we change
coordinates to translate the point $(-1, 0)$ to the origin and compute the
multiplicity there. (This often simplifies the calculations; we leave the fact
that these two procedures give the same results to the exercises.) So, set
$X = x + 1$, $Y = y$ (we want $X$ and $Y$ to be 0 when $x = -1$ and $y = 0$)
and let $S = k[X, Y]_{\langle X, Y \rangle}$. Then $x^2 + x^3 = (X - 1)^2 + (X - 1)^3 =
X^3 - 2X^2 + X$ and $y^2 = Y^2$, and we want to compute the multiplicity
of $(0, 0)$ as a solution of $X^3 - 2X^2 + X = Y^2 = 0$. Now we note that
$X^3 - 2X^2 + X = X(1 - 2X + X^2)$ and $1/(1 - 2X + X^2) \in S$. Thus,
the ideal generated by $X$ and $Y^2$ in $S$ is the same as that generated by
$X^3 - 2X^2 + X$ and $Y^2$ and, therefore,

\[ \dim S/\langle X^3 - 2X^2 + X, Y^2 \rangle S = \dim S/\langle X, Y^2 \rangle S = 2. \]

Again, the equality on the right follows because the cosets of $1, Y$ are a basis
of $S/\langle X, Y^2 \rangle S$. We conclude that the multiplicity of $(-1, 0)$ as a solution of
$x^3 + x^2 = y^2 = 0$ is 2.

Thus, we have shown that the polynomials $x^3 + x^2$ and $y^2$ have two
common zeros, one of multiplicity 4 and the other of multiplicity 2. When
the total number of zeros is counted with multiplicity, we obtain 6, in
agreement with the fact that the dimension of the quotient ring of
$k[x, y]$ by the ideal generated by these polynomials is 6.
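
The change of coordinates in this computation can be double-checked with a few lines of polynomial arithmetic. The sketch below represents polynomials in $X$ as integer coefficient lists (lowest degree first), expands $(X - 1)^2 + (X - 1)^3$, and confirms that the cofactor of $X$ has nonzero constant term, which is what makes it a unit in $S$. The helper names are illustrative assumptions, and the two multiplicities at the end are simply the monomial counts stated in the text.

```python
def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_add(a, b):
    """Sum of two coefficient lists of possibly different lengths."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [ai + bi for ai, bi in zip(a, b)]

shift = [-1, 1]                          # X - 1, i.e. x written in terms of X = x + 1
square = poly_mul(shift, shift)          # (X - 1)^2
cube = poly_mul(square, shift)           # (X - 1)^3
translated = poly_add(square, cube)      # x^2 + x^3 in the new coordinate

cofactor = translated[1:]                # translated = X * cofactor
# The cofactor is a unit in S exactly when its constant term is nonzero
is_unit_in_S = translated[0] == 0 and cofactor[0] != 0

mult_origin = len([(a, b) for a in range(2) for b in range(2)])   # 1, x, y, xy
mult_minus_one = len([(0, b) for b in range(2)])                  # 1, Y
total = mult_origin + mult_minus_one
```

The total agrees with the global dimension 6 found earlier from the Gröbner basis.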

Exercise 2.

a. Find all points in $\mathbf{V}(x^2 - 2x + y^2, x^2 - 4x + 4y^4) \subset \mathbb{C}^2$ and compute
the multiplicity of each as above.

b. Verify that the sum of the multiplicities is equal to

\[ \dim \mathbb{C}[x, y]/\langle x^2 - 2x + y^2, x^2 - 4x + 4y^4 \rangle. \]

c. What is the geometric explanation for the solution of multiplicity > 1
in this example?

Before turning to the question of computing the dimension of a quotient
of a local ring in more complicated examples, we will verify that the total
number of solutions of a system $f_1 = \cdots = f_s = 0$, counted with
multiplicity, is the dimension of $k[x_1, \ldots, x_n]/I$ when $k$ is algebraically closed and
$I = \langle f_1, \ldots, f_s \rangle$ is zero-dimensional. In a sense, this is confirmation that
our definition of multiplicity behaves as we would wish. In the following
discussion, if $\{p_1, \ldots, p_m\}$ is a finite subset of $k^n$, and $M_i = \mathbf{I}(\{p_i\})$ is the
maximal ideal of $k[x_1, \ldots, x_n]$ corresponding to $p_i$, we will write

\[ k[x_1, \ldots, x_n]_{M_i} = \{ f/g : g(p_i) \neq 0 \} = \mathcal{O}_i \]

for simplicity of notation.

(2.2) Theorem. Let $I$ be a zero-dimensional ideal in $k[x_1, \ldots, x_n]$ ($k$
algebraically closed) and let $\mathbf{V}(I) = \{p_1, \ldots, p_m\}$. Then there is an
isomorphism between $k[x_1, \ldots, x_n]/I$ and the direct product of the rings
$A_i = \mathcal{O}_i/I\mathcal{O}_i$, for $i = 1, \ldots, m$.

Proof. For each $i = 1, \ldots, m$, there are ring homomorphisms

\[ \varphi_i : k[x_1, \ldots, x_n] \to A_i, \qquad f \mapsto [f]_i, \]

where $[f]_i$ is the coset of $f$ in the quotient ring $\mathcal{O}_i/I\mathcal{O}_i$. Hence we get a
ring homomorphism

\[ \varphi : k[x_1, \ldots, x_n] \to A_1 \times \cdots \times A_m, \qquad f \mapsto ([f]_1, \ldots, [f]_m). \]

Since $f \in I$ implies $[f]_i = 0 \in A_i$ for all $i$, we have $I \subset \ker(\varphi)$.
So to prove the theorem, we need to show first that $I = \ker(\varphi)$ (by
the fundamental theorem on ring homomorphisms, this will imply that
$\mathrm{im}(\varphi) \cong k[x_1, \ldots, x_n]/I$), and second that $\varphi$ is onto.

To prepare for this, we need to establish three basic facts. We use the
notation $f \equiv g \bmod I$ to mean $f - g \in I$.

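(2.3) Lemma. Let $M_i = \mathbf{I}(\{p_i\})$ in $k[x_1, \ldots, x_n]$.

a. There exists an integer $d \geq 1$ such that $(\bigcap_{i=1}^m M_i)^d \subset I$.

b. There are polynomials $e_i \in k[x_1, \ldots, x_n]$, $i = 1, \ldots, m$, such that
$\sum_{i=1}^m e_i \equiv 1 \bmod I$, $e_i e_j \equiv 0 \bmod I$ if $i \neq j$, and $e_i^2 \equiv e_i \bmod I$.

c. If $g \in k[x_1, \ldots, x_n] \setminus M_i$, then there exists $h \in k[x_1, \ldots, x_n]$ such that
$hg \equiv e_i \bmod I$.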

Proof of the Lemma. Part a is an easy consequence of the
Nullstellensatz, so we leave that part to the reader as Exercise 7 below.

Turning to part b, Lemma (2.9) of Chapter 2 implies the existence of
polynomials $g_i \in k[x_1, \ldots, x_n]$ such that $g_i(p_j) = 0$ if $i \neq j$, and $g_i(p_i) = 1$
for each $i$. Let

(2.4)
\[ e_i = 1 - (1 - g_i^d)^d, \]

where $d$ is as in part a. Expanding the second term on the right in (2.4)
with the binomial theorem and canceling the 1’s, we see that $e_i \in M_j^d$ for
all $j \neq i$. On the other hand, (2.4) implies $e_i - 1 \in M_i^d$ for all $i$. Hence for
each $i$,

\[ \sum_j e_j - 1 = e_i - 1 + \sum_{j \neq i} e_j \]

is an element of $M_i^d$. Since this is true for all $i$, $\sum_j e_j - 1 \in \bigcap_{i=1}^m M_i^d$.
Because the $M_i$ are distinct maximal ideals, $M_i + M_j = k[x_1, \ldots, x_n]$
whenever $i \neq j$. It follows that $\bigcap_{i=1}^m M_i^d = (\bigcap_{i=1}^m M_i)^d$ (see Exercise 8
below). Hence $\sum_j e_j - 1 \in (\bigcap_{i=1}^m M_i)^d \subset I$. The second statement in part
b also follows from what we have said: $e_i e_j \in \bigcap_{l=1}^m M_l^d \subset I$ whenever
$i \neq j$. The congruence $e_i^2 \equiv e_i \bmod I$ follows from the other two statements
in part b. See Exercise 9 below.

For part c, by multiplying by a constant, we may assume $g(p_i) = 1$.
Then $1 - g \in M_i$, and hence taking $h = (1 + (1 - g) + \cdots + (1 - g)^{d-1}) e_i$,

\[ hg = h(1 - (1 - g)) = (1 - (1 - g)^d) e_i = e_i - (1 - g)^d e_i. \]

Since $(1 - g)^d \in M_i^d$ and $e_i \in M_j^d$ for all $j \neq i$, as shown above, we have
$(1 - g)^d e_i \in I$ by part a, and the lemma is established. □
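
Formula (2.4) can be tested on a tiny one-variable instance. The sketch below takes $I = \langle x^2 + x^3 \rangle \subset \mathbb{Q}[x]$, so that $\mathbf{V}(I) = \{0, -1\}$ and $d = 2$ works in part a. This example, the choice $g_1 = x + 1$, $g_2 = -x$, and all function names are our own assumptions for illustration. Polynomials are integer coefficient lists (lowest degree first), and reduction modulo $I$ is remainder on division by the monic generator.

```python
def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_sub(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) - (b[i] if i < len(b) else 0)
            for i in range(n)]

def poly_pow(a, e):
    out = [1]
    for _ in range(e):
        out = poly_mul(out, a)
    return out

def reduce_mod(a, modulus):
    """Remainder of a on division by a monic polynomial, padded to deg(modulus) entries."""
    a = list(a)
    deg = len(modulus) - 1
    while len(a) > deg:
        lead = a.pop()                   # coefficient of the current top degree
        shift = len(a) - deg             # subtract lead * x^shift * modulus
        for i in range(deg):
            a[shift + i] -= lead * modulus[i]
    return a + [0] * (deg - len(a))

generator = [0, 0, 1, 1]                 # x^2 + x^3, monic
d = 2
g1, g2 = [1, 1], [0, -1]                 # g1(0) = 1, g1(-1) = 0; g2(0) = 0, g2(-1) = 1

def idempotent(g):
    """e = 1 - (1 - g^d)^d reduced modulo the generator, as in (2.4)."""
    inner = poly_sub([1], poly_pow(g, d))
    return reduce_mod(poly_sub([1], poly_pow(inner, d)), generator)

e1, e2 = idempotent(g1), idempotent(g2)
sum_e = reduce_mod([a + b for a, b in zip(e1, e2)], generator)
prod_e = reduce_mod(poly_mul(e1, e2), generator)
e1_squared = reduce_mod(poly_mul(e1, e1), generator)
e2_squared = reduce_mod(poly_mul(e2, e2), generator)
```

In this instance $e_1 \equiv 1 - x^2$ and $e_2 \equiv x^2$ modulo $I$, and the three congruences of part b can be read off directly.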

We can now complete the proof of Theorem (2.2). Let $f \in \ker(\varphi)$, and
note that the kernel is characterized as follows:

\[ \ker(\varphi) = \{ f \in k[x_1, \ldots, x_n] : [f]_i = 0 \text{ for all } i \} \]
\[ \phantom{\ker(\varphi)} = \{ f : f \in I\mathcal{O}_i \text{ for all } i \} \]
\[ \phantom{\ker(\varphi)} = \{ f : \text{there exists } g_i \notin M_i \text{ with } g_i f \in I \}. \]

For each of the $g_i$, by part c of the lemma, there exists some $h_i$ such that
$h_i g_i \equiv e_i \bmod I$. As a result, $f \cdot \sum_{i=1}^m h_i g_i = \sum_{i=1}^m h_i (g_i f)$ is an element of
$I$, since each $g_i f \in I$. But on the other hand, $f \cdot \sum_i h_i g_i \equiv f \cdot \sum_i e_i \equiv
f \bmod I$ by part b of the lemma. Combining these two observations, we see
that $f \in I$. Hence $\ker(\varphi) \subset I$. Since we proved earlier that $I \subset \ker(\varphi)$, we
have $I = \ker(\varphi)$.

To conclude the proof, we need to show that $\varphi$ is onto. So let
$([n_1/d_1], \ldots, [n_m/d_m])$ be an arbitrary element of $A_1 \times \cdots \times A_m$, where
$n_i, d_i \in k[x_1, \ldots, x_n]$, $d_i \notin M_i$, and the brackets denote the coset in $A_i$. By
part c of the lemma again, there are $h_i \in k[x_1, \ldots, x_n]$ such that $h_i d_i \equiv
e_i \bmod I$. Consider the polynomial $F = \sum_{i=1}^m h_i n_i e_i \in k[x_1, \ldots, x_n]$. It
is easy to see that $\varphi_i(F) = [n_i/d_i]$ for each $i$ using part b of the lemma.
Hence $\varphi$ is onto. □

An immediate corollary of this theorem is the result we want. 

(2.5) Corollary. Let $k$ be algebraically closed, and let $I$ be a
zero-dimensional ideal in $k[x_1, \ldots, x_n]$. Then $\dim k[x_1, \ldots, x_n]/I$ is the number
of points of $\mathbf{V}(I)$ counted with multiplicity. Explicitly, if $p_1, \ldots, p_m$ are the
distinct points of $\mathbf{V}(I)$ and $\mathcal{O}_i$ is the ring of rational functions defined at
$p_i$, then

\[ \dim k[x_1, \ldots, x_n]/I = \sum_{i=1}^m \dim \mathcal{O}_i/I\mathcal{O}_i = \sum_{i=1}^m m(p_i). \]

Proof. The corollary follows immediately from the theorem by taking
dimensions as vector spaces over $k$. □

A second corollary tells us when a zero-dimensional ideal is radical. 

(2.6) Corollary. Let $k$ be algebraically closed, and let $I$ be a
zero-dimensional ideal in $k[x_1, \ldots, x_n]$. Then $I$ is radical if and only if every
$p \in \mathbf{V}(I)$ has multiplicity $m(p) = 1$.

Proof. If $\mathbf{V}(I) = \{p_1, \ldots, p_m\}$, then Theorem (2.10) of Chapter 2 shows
that $\dim k[x_1, \ldots, x_n]/I \geq m$, with equality if and only if $I$ is radical.
By Corollary (2.5), this inequality can be written $\sum_{i=1}^m m(p_i) \geq m$. Since
$m(p_i)$ is always $\geq 1$, it follows that $\sum_{i=1}^m m(p_i) \geq m$ is an equality if and
only if all $m(p_i) = 1$. □

We next discuss how to compute multiplicities. Given a zero-dimensional
ideal $I \subset k[x_1, \ldots, x_n]$ and a polynomial $f \in k[x_1, \ldots, x_n]$, let $m_f$ be
multiplication by $f$ on $k[x_1, \ldots, x_n]/I$. Then the characteristic polynomial
$\det(m_f - uI)$ is determined by the points in $\mathbf{V}(I)$ and their multiplicities.
More precisely, we have the following result.

(2.7) Proposition. Let $k$ be an algebraically closed field and let $I$ be a
zero-dimensional ideal in $k[x_1, \ldots, x_n]$. If $f \in k[x_1, \ldots, x_n]$, then

\[ \det(m_f - uI) = (-1)^d \prod_{p \in \mathbf{V}(I)} (u - f(p))^{m(p)}, \]

where $d = \dim k[x_1, \ldots, x_n]/I$ and $m_f$ is the map given by multiplication
by $f$ on $k[x_1, \ldots, x_n]/I$.

Proof. Let $\mathbf{V}(I) = \{p_1, \ldots, p_m\}$. Using Theorem (2.2), we get a diagram:

\[ \begin{array}{ccc}
k[x_1, \ldots, x_n]/I & \cong & A_1 \times \cdots \times A_m \\
m_f \downarrow & & \downarrow m_f \\
k[x_1, \ldots, x_n]/I & \cong & A_1 \times \cdots \times A_m
\end{array} \]

where $m_f : A_1 \times \cdots \times A_m \to A_1 \times \cdots \times A_m$ is multiplication by $f$ on each
factor. This diagram commutes in the same sense as the diagram (5.19) of
Chapter 3.

Hence we can work with $m_f : A_1 \times \cdots \times A_m \to A_1 \times \cdots \times A_m$. If
we restrict to $m_f : A_i \to A_i$, it suffices to show that $\det(m_f - uI) =
(-1)^{m(p_i)} (u - f(p_i))^{m(p_i)}$. Equivalently, we must show that $f(p_i)$ is the
only eigenvalue of $m_f$ on $A_i$.

To prove this, consider the map $\varphi_i : k[x_1, \ldots, x_n] \to A_i$ defined in the
proof of Theorem (2.2), and let $Q_i = \ker(\varphi_i)$. In Exercise 11 below, you
will study the ideal $Q_i$, which is part of the primary decomposition of $I$. In
particular, you will show that $\mathbf{V}(Q_i) = \{p_i\}$ and that $k[x_1, \ldots, x_n]/Q_i \cong
A_i$. Consequently, the eigenvalues of $m_f$ on $A_i$ equal the eigenvalues of $m_f$
on $k[x_1, \ldots, x_n]/Q_i$, which by Theorem (4.5) of Chapter 2 are the values
of $f$ on $\mathbf{V}(Q_i) = \{p_i\}$. It follows that $f(p_i)$ is the only eigenvalue, as
desired. □

The ideas used in the proof of Proposition (2.7) make it easy to determine
the generalized eigenvectors of $m_f$. See Exercise 12 below for the details.

If we know the points $p_1, \ldots, p_m$ of $V(I)$ (for example, we could find
them using the methods of Chapters 2 or 3), then it is a simple matter to
compute their multiplicities using Proposition (2.7). First pick $f$ so that
$f(p_1), \ldots, f(p_m)$ are distinct, and then compute the matrix of $m_f$ relative
to a monomial basis of $k[x_1, \ldots, x_n]/I$ as in Chapters 2 or 3. In typical
cases, the polynomials generating $I$ have coefficients in $\mathbb{Q}$, which means
that the characteristic polynomial $\det(m_f - uI)$ is in $\mathbb{Q}[u]$. Then factor
$\det(m_f - uI)$ over $\mathbb{Q}$, which can easily be done by computer (the Maple
command is factor). This gives

$$\det(m_f - uI) = h_1^{m_1} \cdots h_r^{m_r},$$

where $h_1, \ldots, h_r$ are distinct irreducible polynomials over $\mathbb{Q}$. For each
$p_i \in V(I)$, $f(p_i)$ is a root of a unique $h_j$, and the corresponding exponent
$m_j$ is the multiplicity $m(p_i)$. This follows from Proposition (2.7) and the
properties of irreducible polynomials (see Exercise 13). One consequence is
that those points of $V(I)$ corresponding to the same irreducible factor of
$\det(m_f - uI)$ all have the same multiplicity.
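The recipe can be tried out in the simplest case $n = 1$, where $I = \langle g \rangle$ for a monic $g \in \mathbb{Q}[x]$ and the matrix of $m_x$ in the basis $1, x, \ldots, x^{d-1}$ is the companion matrix of $g$. The Python sketch below is only an illustration under that assumption. The helper names `companion`, `charpoly` and `root_multiplicity` are hypothetical, and the Faddeev-LeVerrier recursion is a stand-in for whatever a computer algebra system does internally. It computes $\det(uI - m_x)$, which differs from $\det(m_x - uI)$ only by the sign $(-1)^d$, and it reads off the multiplicity of a known point by repeated division rather than by factoring over $\mathbb{Q}$.

```python
from fractions import Fraction

def companion(g):
    """Matrix of m_x on Q[x]/<g> in the basis 1, x, ..., x^(d-1).

    g lists the coefficients of a monic polynomial, constant term first."""
    d = len(g) - 1
    M = [[Fraction(0)] * d for _ in range(d)]
    for j in range(d - 1):
        M[j + 1][j] = Fraction(1)        # x * x^j = x^(j+1)
    for i in range(d):
        M[i][d - 1] = -Fraction(g[i])    # x * x^(d-1) = x^d, rewritten modulo g
    return M

def charpoly(M):
    """Coefficients of det(u*I - M), highest power of u first (Faddeev-LeVerrier)."""
    n = len(M)
    coeffs = [Fraction(1)]
    N = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # N <- M*N + c*I, where c is the most recently computed coefficient
        MN = [[sum(M[i][l] * N[l][j] for l in range(n)) for j in range(n)]
              for i in range(n)]
        N = [[MN[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        trace = sum(sum(M[i][l] * N[l][i] for l in range(n)) for i in range(n))
        coeffs.append(-trace / k)
    return coeffs

def root_multiplicity(coeffs, r):
    """Multiplicity of r as a root of a polynomial given highest power first."""
    mult = 0
    while len(coeffs) > 1:
        # synthetic division by (u - r); the last entry is the remainder
        acc, quotient = Fraction(0), []
        for c in coeffs:
            acc = acc * r + c
            quotient.append(acc)
        if quotient.pop() != 0:
            break
        coeffs, mult = quotient, mult + 1
    return mult

g = [0, 0, -1, 1]                    # g = x^3 - x^2 = x^2 (x - 1), so V(I) = {0, 1}
p = charpoly(companion(g))           # coefficients of u^3 - u^2
mults = {r: root_multiplicity(p, r) for r in (0, 1)}
# the point 0 has multiplicity 2 and the point 1 has multiplicity 1
```

Here $f = x$ takes distinct values at the two points, so the exponents of $u$ and $u - 1$ in the characteristic polynomial are the multiplicities, as Proposition (2.7) predicts.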

We can also extend some of the results proved in Chapter 3 about re- 
sultants. For example, the techniques used to prove Theorem (2.2) give 
the following generalization of Theorem (5.4) of Chapter 3 (see Exercise 14 
below for the details). 

(2.8) Proposition. Let $f_1, \ldots, f_n \in k[x_1, \ldots, x_n]$ ($k$ algebraically
closed) have total degrees at most $d_1, \ldots, d_n$ and no solutions at $\infty$. If
$f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n$, where $u_0, \ldots, u_n$ are independent variables,
then there is a nonzero constant $C$ such that

$$\mathrm{Res}_{1,d_1,\ldots,d_n}(f_0, \ldots, f_n) = C \prod_{p \in V(f_1, \ldots, f_n)} (u_0 + u_1 a_1 + \cdots + u_n a_n)^{m(p)},$$

where a point $p \in V(f_1, \ldots, f_n)$ is written $p = (a_1, \ldots, a_n)$.

This tells us that the u-resultant of Chapter 3, §5, computes not only
the points of $V(f_1, \ldots, f_n)$ but also their multiplicities. In Chapter 3, we
also studied the hidden variable method, where we set $x_n = u$ in the
equations $f_1 = \cdots = f_n = 0$ and regard $u$ as a constant. After homogenizing
with respect to $x_0$, we get the resultant $\mathrm{Res}_{x_0, \ldots, x_{n-1}}(F_1, \ldots, F_n)$ from
Proposition (5.9) in Chapter 3, which tells us about the $x_n$-coordinates of
the solutions. In Chapter 3, we needed to assume that the $x_n$-coordinates
were distinct. Now, using Proposition (2.8), it is easy to show that when
$f_1, \ldots, f_n$ have no solutions at $\infty$,

(2.9)
$$\mathrm{Res}_{x_n}(f_1, \ldots, f_n) = \mathrm{Res}_{x_0, \ldots, x_{n-1}}(F_1, \ldots, F_n) = C \prod_{p \in V(f_1, \ldots, f_n)} (u - a_n)^{m(p)},$$

where $p \in V(f_1, \ldots, f_n)$ is written $p = (a_1, \ldots, a_n)$. See Exercise 14 for
the proof.

The formulas given in (2.9) and Proposition (2.8) indicate a deep relation
between multiplicities and resultants. In fact, in the case of two equations
in two unknowns, one can use resultants to define multiplicities. This is
done, for example, in Chapter 8 of [CLO] and Chapter 3 of [Kir].

Exercise 3. Consider the equations

$$f_1 = y^2 - 3 = 0$$
$$f_2 = 6y - x^3 + 9x = 0,$$

and let $I = \langle f_1, f_2 \rangle \subset k[x, y]$.

a. Show that these equations have four solutions with distinct $x$
coordinates.

b. Draw the graphs of $f_1 = 0$ and $f_2 = 0$. Use your picture to explain
geometrically why two of the points should have multiplicity $> 1$.

c. Show that the characteristic polynomial of $m_x$ on $\mathbb{C}[x, y]/I$ is
$u^6 - 18u^4 + 81u^2 - 108 = (u^2 - 3)^2 (u^2 - 12)$.

d. Use part c and Proposition (2.7) to compute the multiplicities of the
four solution points.

e. Explain how you would compute the multiplicities using $\mathrm{Res}(f_1, f_2, y)$
and Proposition (2.8). This is the hidden variable method for
computing multiplicities. Also explain the meaning of the exponent 3 in
$\mathrm{Res}(f_1, f_2, x) = (y^2 - 3)^3$.

Besides resultants and multiplicities, Theorem (2.2) has other interesting
consequences. For instance, suppose that a collection of $n$ polynomials
$f_1, \ldots, f_n$ has a single zero in $k^n$, which we may take to be the origin. Let
$I = \langle f_1, \ldots, f_n \rangle$. Then the theorem implies

(2.10) $$k[x_1, \ldots, x_n]/I \cong k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle} / I k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}.$$

This is very satisfying, but there is more to the story. With the above
hypotheses on $f_1, \ldots, f_n$, one can show that most small perturbations of
$f_1, \ldots, f_n$ result in a system of equations with distinct zeros, each of which
has multiplicity one, and that the number of such zeros is precisely equal to
the multiplicity of the origin as a solution of $f_1 = \cdots = f_n = 0$. Moreover,
the ring $k[x_1, \ldots, x_n]/I$ turns out to be a limit, in a rather precise sense,
of the set of functions on these distinct zeros. Here is a simple example.

Exercise 4. Let $k = \mathbb{C}$ so that we can take limits in an elementary sense.
Consider the ideals It = {y — — t) where t G C is a parameter. 

a. What are the points in $V(I_t)$ for $t \ne 0$? Show that each point has
multiplicity 1, so $A_i \cong k$ for each $i$.

b. Now let $t \to 0$. What is $V(I_0)$ and its multiplicity?

c. Using the proof of Theorem (2.2), work out an explicit isomorphism
between $\mathbb{C}[x, y]/I_t$ and the product of the $A_i$ for $t \ne 0$.

d. What happens as $t \to 0$? Identify the image of a general $f$ in $\mathbb{C}[x, y]/I_0$,
and relate to the image of $f$ in the product of $A_i$ for $t \ne 0$.

We also remark that we can compute multiplicities by passing to the
formal power series ring or, in the cases $k = \mathbb{R}$ or $\mathbb{C}$, to the ring of convergent
power series. More precisely, the following holds.

(2.11) Proposition. Let $I \subset k[x_1, \ldots, x_n]$ be a zero-dimensional ideal
such that the origin is a point of $V(I)$ of multiplicity $m$. Then

$$m = \dim k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle} / I k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$$
$$\phantom{m} = \dim k[[x_1, \ldots, x_n]] / I k[[x_1, \ldots, x_n]].$$

If, moreover, $k = \mathbb{R}$ or $\mathbb{C}$, so that we can talk about whether a power series
converges, then

$$m = \dim k\{x_1, \ldots, x_n\} / I k\{x_1, \ldots, x_n\}$$

as well.

To see the idea behind why this is so, consider the example we looked at
in Exercise 1 above. We showed that
$\dim k[x, y]_{\langle x, y \rangle} / \langle x^2 + x^3, y^2 \rangle k[x, y]_{\langle x, y \rangle} = 4$ by
noting that in $k[x, y]_{\langle x, y \rangle}$, we have

$$\langle x^2 + x^3, y^2 \rangle = \langle x^2, y^2 \rangle$$

because $1/(1 + x) \in k[x, y]_{\langle x, y \rangle}$. As in §1, we can represent $1/(1 + x)$ as
the formal power series $1 - x + x^2 - x^3 + x^4 - \cdots \in k[[x, y]]$ and then

$$(x^2 + x^3)(1 - x + x^2 - x^3 + x^4 - \cdots) = x^2$$

in $k[[x, y]]$. This shows that, in $k[[x, y]]$, $\langle x^2 + x^3, y^2 \rangle = \langle x^2, y^2 \rangle$. It follows
that

$$\dim k[[x, y]] / \langle x^2, y^2 \rangle = 4$$

(as before, the four monomials $1, x, y, xy$ form a vector space basis of
$k[[x, y]] / \langle x^2, y^2 \rangle$). If $k = \mathbb{C}$, the power series $1 - x + x^2 - x^3 + x^4 - \cdots$
is convergent for $x$ with $|x| < 1$, and precisely the same reasoning shows
that $\langle x^2 + x^3, y^2 \rangle = \langle x^2, y^2 \rangle$ in $k\{x, y\}$ as well. Therefore,

$$\dim k\{x, y\} / \langle x^2, y^2 \rangle k\{x, y\} = 4.$$

It is possible to prove the proposition by generalizing these observations,
but it will be more convenient to defer it to §4, so that we can make use of
some additional computational tools for local rings.
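The power series identity used in this example can be checked to any finite order by machine. The following Python sketch is only an illustration: it works with series in the single variable $x$ truncated modulo $x^N$, where the truncation order `N = 8` and the helper name `mul_trunc` are arbitrary choices made here, not notation from the text.

```python
N = 8  # truncation order: all series are taken modulo x^N

def mul_trunc(a, b, n=N):
    """Product of two series given as coefficient lists, truncated modulo x^n."""
    out = [0] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[:n - i]):   # only terms with i + j < n survive
            out[i + j] += ai * bj
    return out

inv = [(-1) ** i for i in range(N)]          # 1 - x + x^2 - x^3 + ... modulo x^N
one_plus_x = [1, 1] + [0] * (N - 2)          # 1 + x
gen = [0, 0, 1, 1] + [0] * (N - 4)           # x^2 + x^3

check_inverse = mul_trunc(one_plus_x, inv)   # equals 1 modulo x^N
check_gen = mul_trunc(gen, inv)              # equals x^2 modulo x^N
```

A finite truncation of course proves nothing about the full series. It only confirms, term by term up to order $N$, that multiplying $x^2 + x^3$ by the expansion of $1/(1 + x)$ leaves $x^2$.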

We will conclude this section by introducing an important invariant in
singularity theory: the Milnor number of a singularity. See [Mil] for the
topological meaning of this integer. One says that a holomorphic (= complex
differentiable) function $f(x_1, \ldots, x_n)$ on $\mathbb{C}^n$ has a singularity at a point
$p \in \mathbb{C}^n$ if the $n$ first order partial derivatives of $f$ have a common zero at
$p$. We will say the singular point $p$ is isolated if there is some neighborhood of
$p$ containing no other singular points of $f$. As usual, when considering a
given singular point $p$, one translates $p$ to the origin. If we do this, then
the assertion that the origin is isolated is enough to guarantee that

$$\dim \mathbb{C}\{x_1, \ldots, x_n\} / \langle \partial f/\partial x_1, \ldots, \partial f/\partial x_n \rangle < \infty.$$

Here, we are using the fact that we can view any complex differentiable
function in a neighborhood of the origin as a formal power series at the
origin (in fact, it is actually a convergent power series at the origin).

(2.12) Definition. Let $f \in \mathbb{C}\{x_1, \ldots, x_n\}$ have an isolated singularity
at the origin. The Milnor number of the singular point, denoted $\mu$, is given
by

$$\mu = \dim \mathbb{C}\{x_1, \ldots, x_n\} / \langle \partial f/\partial x_1, \ldots, \partial f/\partial x_n \rangle.$$

In view of Proposition (2.11), if the function $f$ is a polynomial, the Milnor
number of a singular point $p$ of $f$ is just the multiplicity of the common
zero $p$ of the partials of $f$.
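As a small worked illustration of the definition, take the polynomial $f = x^3 + y^4$. This $f$ is chosen here only for the sake of the example and does not come from the text. Its partial derivatives vanish simultaneously only at the origin, so the singular point is isolated, and the computation reduces to counting monomials:

```latex
f = x^3 + y^4, \qquad
\frac{\partial f}{\partial x} = 3x^2, \qquad
\frac{\partial f}{\partial y} = 4y^3,
\\[4pt]
\left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle
  = \langle x^2, y^3 \rangle \subset \mathbb{C}\{x, y\},
\\[4pt]
\mu = \dim \mathbb{C}\{x, y\} / \langle x^2, y^3 \rangle
    = \#\{\, 1,\ x,\ y,\ xy,\ y^2,\ xy^2 \,\} = 6.
```

The six monomials listed are exactly the $x^i y^j$ with $i < 2$ and $j < 3$, which form a vector space basis of the quotient.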

Exercise 5. Each of the following $f(x, y) \in \mathbb{C}[x, y]$ has an isolated
singular point at $(0, 0)$. For each, determine the Milnor number by
determining

$$\mu = \dim \mathbb{C}[[x, y]] / \langle \partial f/\partial x, \partial f/\partial y \rangle.$$

a. $f(x, y) = y^2 - x^2 - x^3$.
b. $f(x, y) = y^2 - x^3$.
c. $f(x, y) = y^2 - x^5$.

In intuitive terms, the larger the Milnor number is, the more complicated
the structure of the singular point is. To conclude this section, we mention
that there is a closely related invariant of singularities called the Tjurina
number, which is defined by

$$\tau = \dim k[[x_1, \ldots, x_n]] / \langle f, \partial f/\partial x_1, \ldots, \partial f/\partial x_n \rangle.$$

Over any field $k$, the Tjurina number is finite precisely when $f$ has an
isolated singular point.



Additional Exercises for §2 

Exercise 6. If $p \in V(I)$ and $M = I(\{p\})$ is the maximal ideal of $p$, then
prove that $k[x_1, \ldots, x_n]_M / I k[x_1, \ldots, x_n]_M$ is a local ring. Also show that
the dimension of this ring, which is the multiplicity $m(p)$, is $\ge 1$. Hint: Show
that the map $k[x_1, \ldots, x_n]_M / I k[x_1, \ldots, x_n]_M \to k$ given by evaluating a
coset at $p$ is a well-defined linear map which is onto.

Exercise 7. Using the Nullstellensatz, prove part a of Lemma (2.3). 

Exercise 8. Let $I$ and $J$ be any two ideals in a ring $R$ such that $I + J = R$
(we sometimes say $I$ and $J$ are comaximal).

a. Show that $IJ = I \cap J$.

b. From part a, deduce that if $d \ge 1$, then $I^d \cap J^d = (I \cap J)^d$.

c. Generalize part b to any number of ideals $I_1, \ldots, I_r$ if $I_i + I_j = R$
whenever $i \ne j$.

Exercise 9. Show that if $e_i$ are the polynomials constructed in (2.4) for
part b of Lemma (2.3), then $e_i^2 \equiv e_i \bmod I$. Hint: Use the other two
statements in part b.

Exercise 10. In this exercise, we will use Theorem (2.2) to give a new
proof of Theorem (4.5) of Chapter 2. Let $A_i$ be the local ring $O_i / I O_i$ as
in the proof of Theorem (2.2). For $f \in k[x_1, \ldots, x_n]$, let $m_f : A_i \to A_i$ be
multiplication by $f$. Also, the coset of $f$ in $A_i$ will be denoted $[f]_i$.

a. Prove that $m_f$ is a vector space isomorphism if and only if $[f]_i \in A_i$ is
invertible, i.e., there is $[g]_i \in A_i$ such that $[f]_i [g]_i = [1]_i$.

b. Explain why $[f]_i$ is in the maximal ideal of $A_i$ if and only if $f(p_i) = 0$.

c. Explain why each of the following equivalences is true for a polynomial
$f \in k[x_1, \ldots, x_n]$ and $\lambda \in \mathbb{C}$: $\lambda$ is an eigenvalue of $m_f$ $\Leftrightarrow$ $m_{f - \lambda}$ is not
invertible $\Leftrightarrow$ $[f - \lambda]_i \in A_i$ is not invertible $\Leftrightarrow$ $[f - \lambda]_i$ is in the maximal
ideal of $A_i$ $\Leftrightarrow$ $f(p_i) = \lambda$. Hint: Use parts a and b of this exercise and
part b of Exercise 1 from §1.

d. Combine part c with the isomorphism $k[x_1, \ldots, x_n]/I \cong A_1 \times \cdots \times A_m$
and the commutative diagram from Proposition (2.7) to give a new proof
of Theorem (4.5) of Chapter 2.
Exercise 11. (Primary Decomposition) Let $I$ be a zero-dimensional ideal
with $V(I) = \{p_1, \ldots, p_m\}$. This exercise will explore the relation
between the isomorphism $A = k[x_1, \ldots, x_n]/I \cong A_1 \times \cdots \times A_m$ and the
primary decomposition of $I$. More details on primary decomposition can
be found in [CLO], Chapter 4, §7. We begin with the homomorphism
$\varphi_i : k[x_1, \ldots, x_n] \to A_i$ defined by $\varphi_i(f) = [f]_i \in A_i$ (this is the
notation used in the proof of Theorem (2.2)). Consider the ideal $Q_i$ defined
by

$$Q_i = \ker(\varphi_i) = \{ f \in k[x_1, \ldots, x_n] : [f]_i = [0]_i \text{ in } A_i \}.$$

We will show that the ideals $Q_1, \ldots, Q_m$ give the primary decomposition
of $I$. Let $M_i = I(\{p_i\})$.

a. Show that $I \subset Q_i$ and that $Q_i = \{ f \in k[x_1, \ldots, x_n] :$ there exists $u$ in
$k[x_1, \ldots, x_n] \setminus M_i$ such that $u \cdot f \in I \}$.

b. If $g_1, \ldots, g_m$ are as in the proof of Theorem (2.2), show that for $j \ne i$,
some power of $g_j$ lies in $Q_i$. Hint: Use part a and the Nullstellensatz.

c. Show that $V(Q_i) = \{p_i\}$ and conclude that $\sqrt{Q_i} = M_i$. Hint: Use part
b and the Nullstellensatz.

d. Show that $Q_i$ is a primary ideal, which means that if $fg \in Q_i$, then
either $f \in Q_i$ or some power of $g$ is in $Q_i$. Hint: Use part c. Also, $A_i$ is
a local ring.

e. Prove that $I = Q_1 \cap \cdots \cap Q_m$. This is the primary decomposition of $I$
(see Theorem 7 of [CLO], Chapter 4, §7).

f. Show that $k[x_1, \ldots, x_n]/Q_i \cong A_i$. Hint: Show that $\varphi_i$ is onto using the
proof of Theorem (2.2).

Exercise 12. (Generalized Eigenspaces) Given a linear map $T : V \to V$,
where $V$ is a finite dimensional vector space, a generalized eigenvector of
$\lambda \in k$ is a nonzero vector $v \in V$ such that $(T - \lambda I)^m (v) = 0$ for some
$m \ge 1$. The generalized eigenspace of $\lambda$ is the space of the generalized
eigenvectors for $\lambda$. When $k$ is algebraically closed, $V$ is the direct sum of its
generalized eigenspaces (see Section 7.1 of [FIS]). We will apply this theory
to the linear map $m_f : A \to A$ to see how the generalized eigenspaces of
$m_f$ relate to the isomorphism $A \cong A_1 \times \cdots \times A_m$ of Theorem (2.2).

a. In the proof of Proposition (2.7), we proved that $f(p_i)$ is the only
eigenvalue of $m_f : A_i \to A_i$. Use this to show that the generalized eigenspace
of $m_f$ is all of $A_i$.

b. If $f(p_1), \ldots, f(p_m)$ are distinct, prove that the decomposition of $A =
k[x_1, \ldots, x_n]/I$ into a direct sum of generalized eigenspaces for $m_f$ is
precisely the isomorphism $A \cong A_1 \times \cdots \times A_m$ of Theorem (2.2).

Exercise 13. 

a. If $h \in \mathbb{Q}[u]$ is irreducible, prove that all roots of $h$ have multiplicity one.
Hint: Compute $h_{red}$.

b. Let $h \in \mathbb{Q}[u]$ be irreducible and let $\lambda \in \mathbb{C}$ be a root of $h$. If $g \in \mathbb{Q}[u]$
and $g(\lambda) = 0$, prove that $h$ divides $g$. Hint: If $\mathrm{GCD}(h, g) = 1$, there are
polynomials $A, B \in \mathbb{Q}[u]$ such that $Ah + Bg = 1$.

c. If $h_1$ and $h_2$ are distinct irreducible polynomials in $\mathbb{Q}[u]$, prove that $h_1$
and $h_2$ have no common roots.

d. Use parts a and c to justify the method for computing multiplicities 
given in the discussion following Proposition (2.7). 

Exercise 14. Prove Proposition (2.8) and the formulas given in (2.9).
Hint: Use Proposition (5.9) of Chapter 2.

Exercise 15. 

a. Let $\ell_1, \ldots, \ell_n$ be homogeneous linear polynomials in $k[x_1, \ldots, x_n]$ with
$V(\ell_1, \ldots, \ell_n) = \{(0, \ldots, 0)\}$. Compute the multiplicity of the origin as
a solution of $\ell_1 = \cdots = \ell_n = 0$.

b. Now let $f_1, \ldots, f_n$ generate a zero-dimensional ideal in $k[x_1, \ldots, x_n]$,
and suppose that the origin is in $V(f_1, \ldots, f_n)$ and the Jacobian matrix

$$J = \left( \partial f_i / \partial x_j \right)$$

has nonzero determinant at the origin. Compute the multiplicity of the
origin as a solution of $f_1 = \cdots = f_n = 0$. Hint: Use part a.

Exercise 16. We say $f \in \mathbb{C}[x_1, \ldots, x_n]$ has an ordinary double point at
the origin $0$ in $\mathbb{C}^n$ if $f(0) = \partial f/\partial x_i(0) = 0$ for all $i$, but the matrix of
second order partial derivatives is invertible at $0$:

$$\det\left( \partial^2 f / \partial x_i \partial x_j \right)\big|_{(x_1, \ldots, x_n) = 0} \ne 0.$$

Find the Milnor number of an ordinary double point. Hint: Use Exercise 15.

Exercise 17. Let $I$ be a zero-dimensional ideal in $k[x_1, \ldots, x_n]$ and let
$p = (a_1, \ldots, a_n) \in V(I)$. Let $X_1, \ldots, X_n$ be a new set of variables, and
consider the set $\tilde{I} \subset k[X_1, \ldots, X_n]$ consisting of all $f(X_1 + a_1, \ldots, X_n + a_n)$
where $f \in I$.

a. Show that $\tilde{I}$ is an ideal in $k[X_1, \ldots, X_n]$, and that the origin is a point
in $V(\tilde{I})$.

b. Show that the multiplicity of $p$ as a point in $V(I)$ is the same as the
multiplicity of the origin as a point in $V(\tilde{I})$. Hint: one approach is to
show that

$$\varphi : k[x_1, \ldots, x_n] \to k[X_1, \ldots, X_n]$$
$$f(x_1, \ldots, x_n) \mapsto f(X_1 + a_1, \ldots, X_n + a_n)$$

defines an isomorphism of rings.
§3 Term Orders and Division in Local Rings

When working with an ideal $I \subset k[x_1, \ldots, x_n]$, for some purposes
we can replace $I$ with its ideal of leading terms $\langle \mathrm{LT}(I) \rangle$. For example,
if $I$ is zero-dimensional, we can compute the dimension of the quotient
ring $k[x_1, \ldots, x_n]/I$ by using the fact that $\dim k[x_1, \ldots, x_n]/I =
\dim k[x_1, \ldots, x_n]/\langle \mathrm{LT}(I) \rangle$. The latter dimension is easy to compute since
$\langle \mathrm{LT}(I) \rangle$ is a monomial ideal (the dimension is just the number of
monomials not in the ideal). The heart of the matter is to compute $\langle \mathrm{LT}(I) \rangle$, which
is done by computing a Gröbner basis of $I$.

A natural question to ask is whether something similar might work in a
local ring. An instructive example occurred in the last section, where we
considered the ideal $I = \langle x^2 + x^3, y^2 \rangle$. For $R = k[x, y]_{\langle x, y \rangle}$ or $k[[x, y]]$ or
$k\{x, y\}$, we computed $\dim R/IR$ by replacing $I$ by the monomial ideal

$$\tilde{I} = \langle x^2, y^2 \rangle.$$

Note that $\tilde{I}$ is generated by the lowest degree terms in the generators
of $I$. This is in contrast to the situation in the polynomial ring, where
$\dim k[x, y]/I$ was computed from $\langle \mathrm{LT}(I) \rangle = \langle x^3, y^2 \rangle$ using the lex leading
terms.
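Counting the monomials outside a zero-dimensional monomial ideal is a finite enumeration, and the contrast between the two computations can be made concrete. The Python sketch below is an illustration only: the function name `count_standard_monomials` and the encoding of monomials as exponent tuples are choices made here. It is applied to $\langle x^2, y^2 \rangle$ and to $\langle x^3, y^2 \rangle$, the lex leading term ideal of $\langle x^2 + x^3, y^2 \rangle$ in $k[x, y]$ (the two generators already form a Gröbner basis because their leading terms are relatively prime).

```python
from itertools import product

def count_standard_monomials(gens):
    """Number of monomials not in the monomial ideal generated by gens.

    gens is a list of exponent tuples. The ideal is assumed to be
    zero-dimensional, i.e. it contains a pure power of every variable."""
    n = len(gens[0])
    bounds = []
    for i in range(n):
        # exponents of generators that are pure powers of the i-th variable
        pure = [g[i] for g in gens if all(g[j] == 0 for j in range(n) if j != i)]
        if not pure:
            raise ValueError("no pure power of variable %d in the ideal" % i)
        bounds.append(min(pure))

    def in_ideal(alpha):
        # x^alpha lies in the ideal iff some generator divides it
        return any(all(a >= e for a, e in zip(alpha, g)) for g in gens)

    box = product(*(range(b) for b in bounds))
    return sum(1 for alpha in box if not in_ideal(alpha))

local_dim = count_standard_monomials([(2, 0), (0, 2)])   # <x^2, y^2>: 1, x, y, xy
global_dim = count_standard_monomials([(3, 0), (0, 2)])  # <x^3, y^2>
```

The first count is 4, the dimension found in the local rings, while the second is 6, the dimension of $k[x, y]/I$. The difference is accounted for by the second point $(-1, 0)$ of $V(I)$, which the local rings at the origin do not see.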

To be able to pick out terms of lowest degree in polynomials as leading
terms, it will be necessary to extend the class of orders on monomials we
can use. For instance, to make the leading term of a polynomial or a power
series be one of the terms of minimal total degree, we could consider what
are known as degree-anticompatible (or anti-graded) orders. By definition
these are orders that satisfy

(3.1) $$|\alpha| < |\beta| \implies x^\alpha > x^\beta.$$

We still insist that our orders be total orderings and be compatible with
multiplication. As in Definition (2.1) of Chapter 1, being a total ordering
means that for any $\alpha, \beta \in \mathbb{Z}^n_{\ge 0}$, exactly one of the following is true:

$$x^\alpha > x^\beta, \quad x^\alpha = x^\beta, \quad \text{or} \quad x^\alpha < x^\beta.$$

Compatibility with multiplication means that for any $\gamma \in \mathbb{Z}^n_{\ge 0}$, if $x^\alpha > x^\beta$,
then $x^{\alpha + \gamma} > x^{\beta + \gamma}$. Notice that property (3.1) implies that $1 > x_i$ for all
$i$, $1 \le i \le n$. Here is a first example.

Exercise 1. Consider terms in k[x]. 

a. Show that the only degree-anticompatible order is the antidegree order: 

$$1 > x > x^2 > x^3 > \cdots.$$

b. Explain why the antidegree order is not a well-ordering. 
Any total ordering that is compatible with multiplication and that
satisfies $1 > x_i$ for all $i$, $1 \le i \le n$, is called a local order. A
degree-anticompatible order is a local order (but not conversely; see Exercise 2
below).

Perhaps the simplest example of a local order in $n$ variables is
degree-anticompatible lexicographic order, abbreviated alex, which first sorts by
total degree, lower degree terms preceding higher degree terms, and which
sorts monomials of the same total degree lexicographically.

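(3.2) Definition (Anti-graded Lex Order). Let $\alpha, \beta \in \mathbb{Z}^n_{\ge 0}$. We say
$x^\alpha >_{alex} x^\beta$ if

$$|\alpha| = \sum_{i=1}^n \alpha_i < |\beta| = \sum_{i=1}^n \beta_i,$$

or if

$$|\alpha| = |\beta| \quad \text{and} \quad x^\alpha >_{lex} x^\beta.$$

Thus, for example, in $k[x, y]$, with $x > y$, we have

$$1 >_{alex} x >_{alex} y >_{alex} x^2 >_{alex} xy >_{alex} y^2 >_{alex} x^3 >_{alex} \cdots.$$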

Similarly one defines degree-anticompatible reverse lexicographic, or
arevlex, order as follows.

(3.3) Definition (Anti-graded Revlex Order). Let $\alpha, \beta \in \mathbb{Z}^n_{\ge 0}$. We
say $x^\alpha >_{arevlex} x^\beta$ if

$$|\alpha| < |\beta|, \quad \text{or} \quad |\alpha| = |\beta| \text{ and } x^\alpha >_{revlex} x^\beta.$$

So, for example, we have

$$1 >_{arevlex} x >_{arevlex} y >_{arevlex} z >_{arevlex} x^2 >_{arevlex} xy >_{arevlex} y^2 >_{arevlex} xz >_{arevlex} yz >_{arevlex} z^2 >_{arevlex} \cdots.$$



Degree-anticompatible and local orders lack one of the key properties 
of the monomial orders that we have used up to this point. Namely, the 
third property in Definition (2.1) from Chapter 1, which requires that a 
monomial order be a well-ordering relation, does not hold. Local orders 
are not well-orderings. This can be seen even in the one variable case in 
Exercise 1 above. 

In §4 of this chapter, we will need to make use of even more general 
orders than degree-anticompatible or local orders. Moreover, and somewhat 
surprisingly, the whole theory can be simplified somewhat by generalizing 
at once to consider the whole class of semigroup orders as in the following 
definition. 
(3.4) Definition. An order $>$ on $\mathbb{Z}^n_{\ge 0}$ or, equivalently, on the set of
monomials $x^\alpha$, $\alpha \in \mathbb{Z}^n_{\ge 0}$, in $k[x_1, \ldots, x_n]$ or any of the local rings
$k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$, $k\{x_1, \ldots, x_n\}$, or $k[[x_1, \ldots, x_n]]$, is said to be a
semigroup order if it satisfies:

a. $>$ is a total ordering on $\mathbb{Z}^n_{\ge 0}$.

b. $>$ is compatible with multiplication of monomials.

Semigroup orders include the monomial orders, which have the additional 
well-ordering property, as well as local orders and other orders which do 
not. Since the property of being a well-ordering is often used to assert that 
algorithms terminate, we will need to be especially careful in checking that 
procedures using general term orders terminate. 

It will be convenient to have a method for describing all possible
semigroup orders. One such approach has been given by Robbiano (see [Rob]).
Let $U$ be a real matrix with $n$ columns and some number $m$ of rows. Let
$u_i$ denote the $i$th row of $U$. We can define an ordering $>_U$ on $\mathbb{Z}^n_{\ge 0}$ by
setting $\alpha >_U \beta$ if there is some $k$ such that, using the usual dot product,
$\alpha \cdot u_i = \beta \cdot u_i$ for all $i < k$, but $\alpha \cdot u_k > \beta \cdot u_k$. Every semigroup order
can be described by giving an appropriate matrix $U$. The following exercise
describes the necessary property of $U$ and gives some examples.



Exercise 2. 

a. Show that $>_U$ is compatible with multiplication for every matrix $U$.

b. Show that $>_U$ is a total ordering if and only if
$\ker(U) \cap \mathbb{Z}^n = \{(0, \ldots, 0)\}$.

c. Show that the lex monomial order with $x_1 > x_2 > \cdots > x_n$ is the
order $>_I$, where $I$ is the $n \times n$ identity matrix.

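d. Show that the alex order is the order $>_U$ defined by the matrix

$$U = \begin{pmatrix}
-1 & -1 & \cdots & -1 \\
0 & -1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & -1
\end{pmatrix}.$$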

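e. Show that the arevlex order is the order $>_U$ for

$$U = \begin{pmatrix}
-1 & -1 & \cdots & -1 & -1 \\
0 & 0 & \cdots & 0 & -1 \\
0 & 0 & \cdots & -1 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & -1 & \cdots & 0 & 0
\end{pmatrix}.$$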



f. Find a local order that is not degree-anticompatible. Hint: What is it 
about the corresponding matrices that makes alex and arevlex degree- 
anticompatible, resp. local? 
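The matrix description of an order translates directly into a sort key: the vector of dot products of $\alpha$ with the rows of $U$, compared lexicographically. The Python sketch below is an illustration rather than a proof of anything in the exercise. The function names are hypothetical, and it assumes that the alex matrix is the upper triangular matrix whose nonzero entries are all $-1$ and that the arevlex matrix has a first row of $-1$'s followed by rows with a single $-1$ in positions $n, n-1, \ldots, 2$.

```python
from itertools import product

def order_key(U, alpha):
    """Vector of dot products of alpha with the rows of U; comparing these
    vectors lexicographically is the comparison that defines >_U."""
    return tuple(sum(u * a for u, a in zip(row, alpha)) for row in U)

def alex_matrix(n):
    """Upper triangular n x n matrix whose nonzero entries are all -1."""
    return [[0] * i + [-1] * (n - i) for i in range(n)]

def arevlex_matrix(n):
    """Row of -1's, then rows with a single -1 in columns n, n-1, ..., 2."""
    rows = [[-1] * n]
    for i in range(n - 1, 0, -1):
        rows.append([-1 if j == i else 0 for j in range(n)])
    return rows

def sorted_monomials(U, n, max_degree):
    """Exponent vectors of total degree <= max_degree, largest first under >_U."""
    monos = [a for a in product(range(max_degree + 1), repeat=n)
             if sum(a) <= max_degree]
    return sorted(monos, key=lambda a: order_key(U, a), reverse=True)

alex_chain = sorted_monomials(alex_matrix(2), 2, 2)
# 1 > x > y > x^2 > xy > y^2
arevlex_chain = sorted_monomials(arevlex_matrix(3), 3, 2)
# 1 > x > y > z > x^2 > xy > y^2 > xz > yz > z^2
```

Sorting the monomials of degree at most 2 reproduces the two chains displayed after Definitions (3.2) and (3.3), which is a useful sanity check on a candidate matrix before attempting the proofs in parts d and e.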
If $f = \sum_\alpha c_\alpha x^\alpha \in k[x_1, \ldots, x_n]$ is a polynomial and $>$ is a
semigroup order, we define the multidegree, the leading coefficient, the leading
monomial, and the leading term of $f$ exactly as we did for a monomial
order:

$$\mathrm{multideg}(f) = \max\{\alpha \in \mathbb{Z}^n_{\ge 0} : c_\alpha \ne 0\}$$
$$\mathrm{LC}(f) = c_{\mathrm{multideg}(f)}$$
$$\mathrm{LM}(f) = x^{\mathrm{multideg}(f)}$$
$$\mathrm{LT}(f) = \mathrm{LC}(f) \cdot \mathrm{LM}(f).$$

In addition, each semigroup order $>$ defines a particular ring of fractions
in $k(x_1, \ldots, x_n)$ as in Exercise 9 of §1 of this chapter. Namely, given $>$, we
consider the set

$$S = \{1 + g \in k[x_1, \ldots, x_n] : g = 0, \text{ or } \mathrm{LT}_>(g) < 1\}.$$

$S$ is closed under multiplication since if $\mathrm{LT}_>(g) < 1$ and $\mathrm{LT}_>(g') < 1$, then
$(1 + g)(1 + g') = 1 + g + g' + gg'$, and $\mathrm{LT}(g + g' + gg') < 1$ as well by the
definition of a semigroup order.

(3.5) Definition. Let $>$ be a semigroup order on monomials in the ring
$k[x_1, \ldots, x_n]$ and let $S = \{1 + g : \mathrm{LT}(g) < 1\}$. The localization of
$k[x_1, \ldots, x_n]$ with respect to $>$ is the ring

$$\mathrm{Loc}_>(k[x_1, \ldots, x_n]) = S^{-1} k[x_1, \ldots, x_n] = \{ f/(1 + g) \mid 1 + g \in S \}.$$

For example, if $>$ is a monomial order, then there are no nonzero
monomials smaller than 1, so $S = \{1\}$ and $\mathrm{Loc}_>(k[x_1, \ldots, x_n]) = k[x_1, \ldots, x_n]$.
On the other hand, if $>$ is a local order, then since $1 > x_i$ for all $i$,

$$\{ g : g = 0, \text{ or } \mathrm{LT}_>(g) < 1 \} = \langle x_1, \ldots, x_n \rangle.$$

Hence, for a local order, we have that $S$ is contained in the set of units in
$k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$, so $\mathrm{Loc}_>(k[x_1, \ldots, x_n]) \subset k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$. But
in fact, by adjusting constants between the numerator and the denominator
in a general $f/h \in k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$, it is easy to see that $f/h =
f'/(1 + g)$ for some $1 + g \in S$. Hence if $>$ is a local order, then

$$\mathrm{Loc}_>(k[x_1, \ldots, x_n]) = k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}.$$

The next two exercises give some additional, more general, and also 
quite suggestive examples of semigroup orders and their associated rings of 
fractions. 

Exercise 3. Using $>_{alex}$ on the $x$-terms, and $>_{lex}$ on the $y$-terms, define
a mixed order $>_{mixed}$ by $x^\alpha y^\beta >_{mixed} x^{\alpha'} y^{\beta'}$ if either $y^\beta >_{lex} y^{\beta'}$, or
$y^\beta = y^{\beta'}$ and $x^\alpha >_{alex} x^{\alpha'}$.

a. Show that $>_{mixed}$ is a semigroup order and find a matrix $U$ such that
$>_{mixed} \, = \, >_U$.

b. Show that $>_{mixed}$ is neither a well-ordering, nor degree-anticompatible.

c. Let $g \in k[x_1, \ldots, x_n, y_1, \ldots, y_m]$. Show that $1 >_{mixed} \mathrm{LT}_{>_{mixed}}(g)$ if
and only if $g$ depends only on $x_1, \ldots, x_n$, and is in $\langle x_1, \ldots, x_n \rangle \subset
k[x_1, \ldots, x_n]$.

d. Let $R = k[x_1, \ldots, x_n, y_1, \ldots, y_m]$. Deduce that $\mathrm{Loc}_{>_{mixed}}(R)$ is the
ring $k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}[y_1, \ldots, y_m]$, whose elements can be written
as polynomials in the $y_j$, with coefficients that are rational functions of
the $x_i$ in $k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$.

Exercise 4. If we proceed as in Exercise 3 but compare the $x$-terms first,
we get a new order $>_{mixed'}$ defined by $x^\alpha y^\beta >_{mixed'} x^{\alpha'} y^{\beta'}$ if either
$x^\alpha >_{alex} x^{\alpha'}$, or $x^\alpha = x^{\alpha'}$ and $y^\beta >_{lex} y^{\beta'}$.

a. Show that $>_{mixed'}$ is a semigroup order and find a matrix $U$ such that
$>_{mixed'} \, = \, >_U$.

b. Show that $>_{mixed'}$ is neither a well-ordering, nor degree-anticompatible.

c. Which elements $f \in k[x_1, \ldots, x_n, y_1, \ldots, y_m]$ satisfy $1 >_{mixed'} \mathrm{LT}_{>_{mixed'}}(f)$?

d. What is $\mathrm{Loc}_{>_{mixed'}}(k[x_1, \ldots, x_n, y_1, \ldots, y_m])$?

Note that the order $>_{mixed}$ from Exercise 3 has the following elimination
property: If $x^\alpha >_{mixed} x^{\alpha'} y^{\beta'}$, then $\beta' = 0$. Equivalently, any monomial
containing one of the $y_j$ is greater than all monomials containing only the
$x_i$. It follows that if the $>_{mixed}$ leading term of a polynomial depends only
on the $x_i$, then the polynomial does not depend on any of the $y_j$. We will
return to this comment in §4 after developing analogs of the division
algorithm and Gröbner bases for general term orders, because this is precisely
the property we need for elimination theory.

Given any semigroup order $>$ on monomials in $k[x_1, \ldots, x_n]$, there is a
natural extension of $>$ to $\mathrm{Loc}_>(k[x_1, \ldots, x_n])$, which we will also denote
by $>$. Namely, if $1 + g \in S$ as in Definition (3.5), the rational function
$1/(1 + g)$ is a unit in $\mathrm{Loc}_>(k[x_1, \ldots, x_n])$, so it shouldn't matter in defining
the leading term of $f/(1 + g)$. For any $h \in \mathrm{Loc}_>(k[x_1, \ldots, x_n])$, we write
$h = f/(1 + g)$ and define

$$\mathrm{multideg}(h) = \mathrm{multideg}(f)$$
$$\mathrm{LC}(h) = \mathrm{LC}(f)$$
$$\mathrm{LM}(h) = \mathrm{LM}(f)$$
$$\mathrm{LT}(h) = \mathrm{LT}(f).$$

Exercise 5. Write $A = k[x_1, \ldots, x_n]$. Show that these notions are
well-defined in $\mathrm{Loc}_>(A)$ in the sense that if $h = f/(1 + g) = f'/(1 + g')$, then
$\mathrm{multideg}(h)$, $\mathrm{LC}(h)$, $\mathrm{LM}(h)$, $\mathrm{LT}(h)$ will be the same whether $f$ or $f'$ is used
to compute them.
In Exercise 8, you will show that if $>$ is a local order, there is also a
natural extension of $>$ to $k[[x_1, \ldots, x_n]]$ (or $k\{x_1, \ldots, x_n\}$ if $k = \mathbb{R}$ or $\mathbb{C}$).
Moreover, in this case, the multidegree and leading term of $h = f/(1 + g) \in
k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$ agree with what one obtains upon viewing $h$ as a
power series (via the series expansion of $1/(1 + g)$).

Our goal in this section is to use general semigroup orders to develop
an extension of the division algorithm in $k[x_1, \ldots, x_n]$ which will yield
information about ideals in $R = \mathrm{Loc}_>(k[x_1, \ldots, x_n])$. The key step in the
usual polynomial division algorithm is the reduction of one polynomial,
$f$, by another, $g$. If $\mathrm{LT}(f) = m \cdot \mathrm{LT}(g)$, for some term $m = c x^\alpha$, we define

$$\mathrm{Red}(f, g) = f - mg,$$

and say that we have reduced $f$ by $g$. The polynomial $\mathrm{Red}(f, g)$ is just what
is left after the first step in dividing $f$ by $g$; it is the first partial dividend.
In the one variable case, the division algorithm in which one divides a
polynomial $f$ by another $g$ is just the process of repeatedly reducing by
$g$ until either one gets 0 or a polynomial (i.e. dividend) that cannot be
reduced by $g$ (because its leading term is not divisible by $\mathrm{LT}(g)$). In the case of
several variables, the division algorithm, in which one divides a polynomial
by a set of other polynomials, is just the process of repeatedly reducing
the polynomial by members of the set and adding leading terms to the
remainder when no reductions are possible. This terminates in the case
of polynomials because successive leading terms form a strictly decreasing
sequence, and such sequences always terminate for well-orderings.

In the case of a local order on a power series ring, one can define $\mathrm{Red}(f, g)$
exactly as above. However, a sequence of successive reductions need no
longer terminate. For example, suppose $f = x$ and we decide to divide $f$
by $g = x - x^2$, so that we successively reduce by $x - x^2$. This gives the
reductions:

$$f_1 = \mathrm{Red}(f, g) = x^2$$
$$f_2 = \mathrm{Red}(f_1, g) = x^3$$
$$\vdots$$
$$f_n = \mathrm{Red}(f_{n-1}, g) = x^{n+1}$$

and so on, which clearly does not terminate. The difficulty, of course, is
that $x > x^2 > x^3 > \cdots$ is a strictly decreasing sequence of terms under
the antidegree order in $k[x]_{\langle x \rangle}$ or $k[[x]]$ which does not terminate.

We can evade this difficulty with a splendid idea of Mora's. When
dividing $f_i$ by $g$, for instance, we allow ourselves to reduce not just by $g$, but
also by the result of any previous reduction. That is, we allow reductions
by $f$ itself (which we can regard as the "zeroth" reduction), or by any of
$f_1, \ldots, f_{i-1}$. More generally, when dividing by a set of polynomials or power
series, we allow ourselves to reduce by the original set together with the
results of any previous reduction. So, in our example, where we are
dividing $f = x$ by $g = x - x^2$, the first reduction is $f_1 = \mathrm{Red}(f, g) = x^2$. For
the next reduction, we allow ourselves to reduce $f_1$ by $f$ as well as $g$. One
checks that

$$\mathrm{Red}(f_1, f) = \mathrm{Red}(x^2, x) = 0,$$

so that we halt. Moreover, this reduction being zero implies $x^2 = xf$. If we
combine this with the equation $f = 1 \cdot g + x^2$ which gives $f_1 = \mathrm{Red}(f, g) = x^2$,
we obtain the relation $f = g + xf$, or $(1 - x) f = g$. This last equation
tells us that in $k[x]_{\langle x \rangle}$, we have

$$f = \frac{1}{1 - x} \, g.$$

In other words, the remainder on division of $f$ by $g$ is zero, as we might
hope, since $x$ and $x - x^2 = x(1 - x)$ generate the same ideal in $k[x]_{\langle x \rangle}$ or
$k[[x]]$.
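The two behaviours in this example, the endless chain of reductions by $g$ and the immediate halt once reduction by $f$ is allowed, can be reproduced in a few lines. The Python sketch below is an illustration under simplifying assumptions: polynomials in one variable are stored as dictionaries from exponents to coefficients, the order is the antidegree order (so the leading term is the term of lowest degree), and the names `lead` and `red` are chosen here.

```python
from fractions import Fraction

def lead(f):
    """Exponent and coefficient of the leading term under the antidegree
    order, i.e. of the lowest-degree term of the nonzero polynomial f."""
    e = min(f)
    return e, f[e]

def red(f, g):
    """Red(f, g) = f - m*g, where m is the term with LT(f) = m * LT(g)."""
    (ef, cf), (eg, cg) = lead(f), lead(g)
    if ef < eg:
        raise ValueError("LT(g) does not divide LT(f)")
    shift, c = ef - eg, Fraction(cf) / cg      # m = c * x^shift
    out = dict(f)
    for e, a in g.items():
        out[e + shift] = out.get(e + shift, 0) - c * a
    return {e: a for e, a in out.items() if a != 0}

f = {1: 1}                  # f = x
g = {1: 1, 2: -1}           # g = x - x^2

chain = [red(f, g)]         # reducing by g alone: x^2, x^3, x^4, ...
for _ in range(4):
    chain.append(red(chain[-1], g))

f1 = red(f, g)              # f1 = x^2
stop = red(f1, f)           # allowing reduction by f gives the zero polynomial
```

Only the first five entries of the chain are computed, since the chain itself never ends. Each entry is a single monomial of degree one higher than the last, whereas `stop` is the empty dictionary, which represents $0$.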

Looking at the above example, one might ask whether it would always
suffice to first reduce by $g$, then subsequently reduce by $f$. Sadly, this is not
the case: it is easy to construct examples where the sequence of reductions
does not terminate. Suppose, for example, that we wish to divide $f = x + x^2$
by $g = x + x^3 + x^5$.

Exercise 6. Show that in this case too, $f$ and $g$ generate the same ideal
in $k[[x]]$ or $k[x]_{\langle x \rangle}$.

Reducing $f$ by $g$ and then subsequently reducing the results by $f_0 = f$
gives the sequence

$$f_1 = \mathrm{Red}(f, g) = x^2 - x^3 - x^5$$
$$f_2 = \mathrm{Red}(f_1, f) = -2x^3 - x^5$$
$$f_3 = \mathrm{Red}(f_2, f) = 2x^4 - x^5$$
$$f_4 = \mathrm{Red}(f_3, f) = -3x^5$$
$$f_5 = \mathrm{Red}(f_4, f) = 3x^6$$
$$f_6 = \mathrm{Red}(f_5, f) = -3x^7,$$

and so on, which again clearly does not terminate. However, we get
something which does terminate by reducing $f_5$ by $f_4$:

$$f_5 = \mathrm{Red}(f_4, f) = 3x^6$$
$$\tilde{f}_6 = \mathrm{Red}(f_5, f_4) = 0.$$

From this, we can easily give an expression for $f$:

$$f = 1 \cdot g + (x - 2x^2 + 2x^3 - 3x^4) \cdot f + (-x) \cdot f_4.$$

Since $f_4$ can be expressed as a sum of a polynomial times $g$ plus a
polynomial times $f$, we can actually express $f$ as a sum of a polynomial times $g$
plus a polynomial which vanishes at the origin times $f$ by backsubstituting:

$$x + x^2 = 1 \cdot (x + x^3 + x^5) + (x - 2x^2 + 2x^3 - 3x^4) \cdot (x + x^2) - x \cdot (-3x^5)$$
$$-3x^5 = -1 \cdot (x + x^3 + x^5) + (1 - x + 2x^2 - 2x^3) \cdot (x + x^2),$$

which implies that

$$x + x^2 = (1 + x)(x + x^3 + x^5) + (-x^2 - x^4)(x + x^2),$$

or, in other words,

$$f = (\text{unit}) \cdot g + (\text{polynomial vanishing at } 0) \cdot f.$$

This, of course, is what we want according to Exercise 6; upon transposing
and solving for $f$, we have $f = (\text{unit}) \cdot g$.
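The arithmetic in this longer example is easy to get wrong by hand, so a mechanical check is worthwhile. The Python sketch below is an illustration only, using the same hypothetical dictionary encoding of one-variable polynomials (exponent to coefficient) and the antidegree order. It recomputes $f_1, \ldots, f_5$, the terminating reduction of $f_5$ by $f_4$, and the final identity expressing $f$ in terms of $g$ and $f$.

```python
from fractions import Fraction

def red(f, g):
    """Red(f, g) = f - m*g under the antidegree order, where the leading
    term is the lowest-degree term and m satisfies LT(f) = m * LT(g)."""
    ef, eg = min(f), min(g)
    if ef < eg:
        raise ValueError("LT(g) does not divide LT(f)")
    shift, c = ef - eg, Fraction(f[ef]) / g[eg]
    out = dict(f)
    for e, a in g.items():
        out[e + shift] = out.get(e + shift, 0) - c * a
    return {e: a for e, a in out.items() if a != 0}

def mul(p, q):
    """Product of two polynomials."""
    out = {}
    for e1, a in p.items():
        for e2, b in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + a * b
    return {e: a for e, a in out.items() if a != 0}

def add(p, q):
    """Sum of two polynomials."""
    out = dict(p)
    for e, a in q.items():
        out[e] = out.get(e, 0) + a
    return {e: a for e, a in out.items() if a != 0}

f = {1: 1, 2: 1}              # f = x + x^2
g = {1: 1, 3: 1, 5: 1}        # g = x + x^3 + x^5

seq = [red(f, g)]             # f1 = Red(f, g)
for _ in range(4):            # f2, ..., f5, each obtained by reducing by f
    seq.append(red(seq[-1], f))
f4, f5 = seq[3], seq[4]
last = red(f5, f4)            # reducing f5 by the earlier result f4 gives 0

# (1 + x) * g + (-x^2 - x^4) * f, which should reproduce f
rhs = add(mul({0: 1, 1: 1}, g), mul({2: -1, 4: -1}, f))
```

Moving the multiple of $f$ to the left-hand side gives $(1 + x^2 + x^4) f = (1 + x) g$, and both $1 + x^2 + x^4$ and $1 + x$ are units in $k[x]_{\langle x \rangle}$ and in $k[[x]]$.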

Our presentation will now follow the recent paper [GGMNPPSS], which
describes the algorithms underlying the latest version of the computer
algebra system Singular. We will introduce this system in the next section.
Since we deal with orders that are not well-orderings, the difficult part is
to give a division process that is guaranteed to terminate. The algorithm
and termination proof from [GGMNPPSS] use a clever synthesis of ideas
due to Lazard and Mora, but the proof is (rather amazingly) both simpler
and more general than Mora’s original one. Using reductions by results of
previous reductions as above, Mora developed a division process for
polynomials based on a local order. His proof used a notion called the écart
of a polynomial, a measurement of the failure of the polynomial to be
homogeneous, and the strategy in the division process was to perform
reductions that decrease the écart. This is described, for instance, in
[MPT]. Also see Exercise 11 below for the basics of this approach. Lazard
had shown how to do the same sort of division by homogenizing the
polynomials and using an appropriate monomial order defined using the local
order. In implementing Singular, the authors of [GGMNPPSS] found that Mora’s
algorithm could be made to work for any semigroup order. The same result was
found independently by Gräbe (see [Gra]). Theorem (3.10) below gives the
statement.

To prepare, we need to describe Lazard’s idea mentioned above. We will
specify the algorithm by using the homogenizations of f and the f_i with
respect to a new variable t. If g ∈ k[x_1, ..., x_n] is any polynomial, we
will write g^h for the homogenization of g with respect to t. That is, if
g = Σ_α c_α x^α and d is the total degree of g, then

    g^h = Σ_α c_α t^{d - |α|} x^α.


(3.6) Definition. Each semigroup order > on monomials in the x_i extends
to a semigroup order >' on monomials in t, x_1, ..., x_n in the following
way. We define t^a x^α >' t^b x^β if either a + |α| > b + |β|, or
a + |α| = b + |β| but x^α > x^β.

In Exercise 12 below, you will show that >' is actually a monomial order
on k[t, x_1, ..., x_n].
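
To make the homogenization g^h and the extended order >' concrete, here is a
small Python sketch. It is an illustration under stated assumptions rather
than anything taken from the text: polynomials are dictionaries from exponent
tuples to coefficients, the function names are hypothetical, and the
semigroup order > is fixed to be the alex order (lower total degree is
larger, ties broken lexicographically).

```python
def homogenize(g):
    """g^h for g given as {alpha: coeff}; keys of the result are (a, alpha),
    standing for the monomial t^a x^alpha with a = d - |alpha|."""
    d = max(sum(alpha) for alpha in g)
    return {(d - sum(alpha), alpha): c for alpha, c in g.items()}

def alex_key(alpha):
    # alex order: smaller total degree is larger, then lex on exponents
    return (-sum(alpha), alpha)

def extended_key(mono, key=alex_key):
    # the order >' of Definition (3.6): total degree in t and x first,
    # then the original order > on the x-part
    a, alpha = mono
    return (a + sum(alpha), key(alpha))

def leading(poly, key):
    """Largest monomial of poly with respect to the order given by key."""
    return max(poly, key=key)

g = {(1, 0): 1, (1, 1): -1}           # g = x - xy in k[x, y]
gh = homogenize(g)                    # g^h = t x - x y
print(leading(g, alex_key))           # exponent of LT(g) = x
print(leading(gh, extended_key))      # (1, (1, 0)), i.e. t * LT(g)
```

In this example d = 2 and |multideg(g)| = 1, so the leading term of g^h picks
up exactly one factor of t.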

By the definition of >', it follows that if t^a >' t^{a'} x^β for some
a, a', β with a = a' + |β|, then 1 > x^β. Hence, writing
R = Loc_>(k[x_1, ..., x_n]),

(3.7)   t^a >' t^{a'} x^β and a = a' + |β|  ⟹  1 + x^β is a unit in R.

It is also easy to see from the definition that if g ∈ k[x_1, ..., x_n], then
homogenization takes the >-leading term of g to the >'-leading term of g^h;
that is, LT_{>'}(g^h) = t^a LT_>(g), where a = d - |multideg_>(g)|.
Conversely, if G is homogeneous in k[t, x_1, ..., x_n], then dehomogenizing
(setting t = 1) takes the leading term LT_{>'}(G) to LT_>(g), where
g = G|_{t=1}.

Given polynomials f, f_1, ..., f_s and a semigroup order >, we want to
show that there is an algorithm (called Mora’s normal form algorithm) for
producing polynomials h, u, a_1, ..., a_s ∈ k[x_1, ..., x_n], where u = 1 + g
and LT(g) < 1 (so u is a unit in Loc_>(k[x_1, ..., x_n])), such that

(3.8)   u · f = a_1 f_1 + ··· + a_s f_s + h,

where LT(a_i)LT(f_i) ≤ LT(f) for all i, and either h = 0, or LT(h) ≤ LT(f).
It will be enough to be able to guarantee that LT(h) is not divisible by any
of LT(f_1), ..., LT(f_s).

Several comments are in order here. First, note that the inputs
f, f_1, ..., f_s, the remainder h, the unit u, and the quotients a_1, ..., a_s
in (3.8) are all polynomials. The equation (3.8) holds in k[x_1, ..., x_n],
and as we will see, all the computations necessary to produce it also take
place in a polynomial ring. We get a corresponding statement in
Loc_>(k[x_1, ..., x_n]) by multiplying both sides by 1/u:

    f = (a_1/u) f_1 + ··· + (a_s/u) f_s + (h/u).

By Exercise 11 of §1, restricting to ideals generated by polynomials entails
no loss of generality when we are studying ideals in
k[x_1, ..., x_n]_(x_1,...,x_n) = Loc_>(k[x_1, ..., x_n]) for a local order >.
But the major reason for restricting the inputs to be polynomials is that
this allows us to specify a completely algorithmic (i.e. finite) division
process. In k[[x_1, ..., x_n]] or k{x_1, ..., x_n}, even a single reduction
(computing Red(f, g)) would take infinitely many computational steps if f or
g were power series with infinitely many nonzero terms.

Second, when dividing f by f_1, ..., f_s as in (3.8), we get a “remainder”
h whose leading term is not divisible by any of the LT(f_i). In contrast, if
we divide using the division algorithm of Chapter 1, §2, we get a remainder
containing no terms divisible by any of the LT(f_i). Conceptually, there
would be no problem with removing a term not divisible by any of the
LT(f_i) and continuing to divide. But as in the first comment, this process
may not be finite.

On the surface, these differences make the results of the Mora normal 
form algorithm seem weaker than those of the division algorithm. Even so, 
we will see in the next section that the Mora algorithm is strong enough 
for many purposes, including local versions of Buchberger’s criterion and 
Buchberger’s algorithm. 

Instead of working with the f, f_i, h, a_i and u directly, our statement of
the algorithm will work with their homogenizations, and with the order >'
from Definition (3.6). Let F = f^h and F_i = f_i^h for i = 1, ..., s. We
first show that there are homogeneous polynomials U, A_1, ..., A_s such that

(3.9)   U · F = A_1 F_1 + ··· + A_s F_s + H,

where LT_{>'}(U) = t^a for some a,

    a + deg F = deg A_i + deg F_i = deg H

whenever A_i, H ≠ 0, and no LT(F_i) divides t^b LT(H) for any b ≥ 0. Note
that since U is homogeneous, if LT(U) = t^a, then by (3.7) when we set
t = 1, the dehomogenization u is a unit in Loc_>(k[x_1, ..., x_n]).

(3.10) Theorem (Mora Normal Form Algorithm). Given homogeneous
polynomials F, F_1, ..., F_s in k[t, x_1, ..., x_n] and the monomial order
>' extending the semigroup order > on monomials in the x_i, there is
an algorithm for producing homogeneous polynomials H, U, A_1, ..., A_s ∈
k[t, x_1, ..., x_n] satisfying

    U · F = A_1 F_1 + ··· + A_s F_s + H,

where LT(U) = t^a for some a,

    a + deg(F) = deg(A_i) + deg(F_i) = deg(H)

whenever A_i, H ≠ 0, and no LT(F_i) divides t^b LT(H) for any b ≥ 0.

Proof. We give below the algorithm for computing the remainder H. 
(The computation of the Ai and U is described in the correctness argument 
below.) An important component of the algorithm is a set L consisting of 
possible divisors for reduction steps. As the algorithm proceeds, this set 
records the results of previous reductions for later use, according to Mora’s 
idea. 

    Input: F, F_1, ..., F_s ∈ k[t, x_1, ..., x_n]
    Output: H as in the statement of Theorem (3.10)

    H := F; L := {F_1, ..., F_s}
    M := {G ∈ L : LT(G) | LT(t^a H) for some a}
    WHILE M ≠ ∅ DO
        SELECT G ∈ M with a minimal
        IF a > 0 THEN
            L := L ∪ {H}
        H := Red(t^a H, G)
        IF t | H THEN
            b := highest power of t dividing H
            H := H/t^b
        M := {G ∈ L : LT(G) | LT(t^a H) for some a}

We claim that the algorithm terminates on all inputs and correctly 
computes H as described in the statement of the theorem. 

To prove termination, in k[t, x_1, ..., x_n] let M_j denote the monomial
ideal ⟨LT(G) : G ∈ L⟩ after the jth pass through the WHILE loop (j ≥ 0).
Since the polynomial ring k[t, x_1, ..., x_n] satisfies the ascending chain
condition on ideals, there is some N such that M_N = M_{N+1} = ···. This
implies that no new elements are added to L after the Nth pass through
the WHILE loop. (Otherwise, for some j > N the value of H at that
point would be a polynomial such that the smallest a for which there exists
G_0 ∈ L with LT(G_0) | t^a LT(H) is strictly positive, but LT(H) ∈ M_j =
⟨LT(G) : G ∈ L⟩, a clear contradiction.) After the Nth pass through the
WHILE loop, the algorithm continues with a fixed set of divisors L, and at
each step a reduction takes place decreasing the >'-leading term of H. Since
>' is a monomial order on k[t, x_1, ..., x_n], the process must terminate as
in the proof of the usual division algorithm.

The correctness proof is by induction on the number of passes through
the WHILE loop. At the start, we have U = 1, A_i = 0 for all i, and H = F,
so (3.9) is trivially satisfied. Assume we perform the test in the WHILE
at the start of the jth pass with current values U_j of U, A_{i,j} of A_i,
and H_j of H satisfying (3.9). Write a_k for the exponent of t in LT(U_k)
for each k ≤ j. We may assume that t^{a_j - a_k} LT(H_k) >' LT(H_j) for all
k < j.

If no LT(F_i) divides any t^a LT(H_j), then we are done. Otherwise some
G ∈ L satisfies LT(G) | LT(t^a H_j) with a minimal. There are two
possibilities to consider. Either G = F_i for some i, or G = H_k for some
k < j.

If G = F_i for some i, and a is chosen as in the statement of the algorithm
above, then we have an equation

    t^a U_j F = t^a A_{1,j} F_1 + ··· + t^a A_{s,j} F_s + t^a H_j - M F_i + M F_i,

where M · LT(F_i) = t^a LT(H_j). Taking U_{j+1} = t^a U_j,

    A_{l,j+1} = t^a A_{l,j}        if l ≠ i
    A_{l,j+1} = t^a A_{l,j} + M    if l = i

and H_{j+1} = t^a H_j - M F_i, we get a new expression of the form (3.9) in
which LT(U_{j+1}) = t^{a + a_j}. Hence a_{j+1} = a_j + a, and the other
required properties follow easily by the induction hypotheses.

On the other hand, if G is one of the results H_k of a previous reduction,
we have M · LT(H_k) = t^a LT(H_j) and by induction from the previous step

(3.11)   U_k F = A_{1,k} F_1 + ··· + A_{s,k} F_s + H_k,

as well as

    t^a U_j F = t^a A_{1,j} F_1 + ··· + t^a A_{s,j} F_s + t^a H_j - M H_k + M H_k.

Substituting from (3.11) into this last equation, we have a relation of the
form (3.9) with U_{j+1} = t^a U_j - M U_k, H_{j+1} = t^a H_j - M H_k, and
A_{i,j+1} = t^a A_{i,j} - M A_{i,k}. Since t^{a_j - a_k} LT(H_k) >' LT(H_j),
the leading term of U_{j+1} is t^{a + a_j}, so a_{j+1} = a + a_j as above.
The other required properties follow by induction. □
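
For readers who want to experiment, the following Python sketch specializes
the algorithm to one variable x with the local order 1 > x > x^2 > ···. It
is a hedged illustration, not the implementation of [GGMNPPSS] or of
Singular: the function names and the data layout are assumptions made here,
and only the remainder H is computed (not U or the A_i). A homogeneous
polynomial in k[t, x] is stored as a pair (D, p), where D is its total degree
and p maps each x-exponent e to the coefficient of t^(D-e) x^e; under >' the
leading term is the one with the smallest x-exponent.

```python
from fractions import Fraction

def hom(p):
    """Homogenize p = {x-exponent: coeff} (p nonzero) with respect to t."""
    return (max(p), dict(p))

def min_shift(G, H):
    """Least a >= 0 with LT(G) | LT(t^a H), or None if there is none."""
    (DG, g), (DH, h) = G, H
    eg, eh = min(g), min(h)
    if eg > eh:                       # x-part of LT(G) does not divide
        return None
    return max(0, (DG - eg) - (DH - eh))

def mora_remainder(f, divisors):
    """Dehomogenized remainder of f on division by divisors in k[x]."""
    H = hom(f)
    L = [hom(g) for g in divisors]
    while H[1]:
        shifts = [(min_shift(G, H), i) for i, G in enumerate(L)]
        M = [(a, i) for a, i in shifts if a is not None]
        if not M:
            break
        a, i = min(M)                 # SELECT a divisor with a minimal
        if a > 0:
            L.append(H)               # remember H for later reductions
        g, h = L[i][1], H[1]
        eg, eh = min(g), min(h)
        c = Fraction(h[eh]) / g[eg]
        new = dict(h)                 # Red(t^a H, G), written via x-exponents
        for e, coeff in g.items():
            new[e + eh - eg] = new.get(e + eh - eg, 0) - c * coeff
        new = {e: v for e, v in new.items() if v != 0}
        # dividing by the highest power of t leaves total degree max(new)
        H = (max(new), new) if new else (0, {})
    return H[1]

# f = x + x^2 divided by g = x + x^3 + x^5, as in the example before (3.6)
print(mora_remainder({1: 1, 2: 1}, [{1: 1, 3: 1, 5: 1}]))   # {}
```

On this input the sketch adds F and later the analog of f_4 = -3x^5 to L and
stops with remainder 0, mirroring the hand computation earlier in the
section.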

Next, we claim that after dehomogenizing the result of the Mora 
algorithm, we obtain an expression (3.8) satisfying the required conditions. 

(3.12) Corollary. Let f, u, a_i, f_i, h be the dehomogenizations of the
polynomials in an expression produced by Mora’s algorithm. Then for all i
such that a_i ≠ 0, LT(a_i)LT(f_i) ≤ LT(f). Moreover, LT(h) is not divisible
by any LT(f_i).

Proof. See Exercise 14. □ 

Exercise 7. Carry out the Mora normal form algorithm dividing f =
x^2 + y^2 by f_1 = x - xy, f_2 = y^2 + x^3 using the alex order in k[x, y].

In Loc_>(k[x_1, ..., x_n]) (but not always in k[[x_1, ..., x_n]]), we get a
version of the Mora algorithm which doesn’t require f to be a polynomial.

(3.13) Corollary. Let > be a semigroup order on monomials in the ring
k[x_1, ..., x_n] and let R = Loc_>(k[x_1, ..., x_n]). Let f ∈ R and
f_1, ..., f_s ∈ k[x_1, ..., x_n]. There is an algorithm for computing
h, a_1, ..., a_s ∈ R such that

    f = a_1 f_1 + ··· + a_s f_s + h,

where LT(a_i)LT(f_i) ≤ LT(f) for all i, and either h = 0, or LT(h) ≤ LT(f)
and LT(h) is not divisible by any of LT(f_1), ..., LT(f_s).

Proof. If we write f in the form f'/u' where f', u' ∈ k[x_1, ..., x_n] and
u' is a unit in R, then dividing f' by f_1, ..., f_s via homogenizing,
applying Theorem (3.10), then dehomogenizing, gives

    u · f' = a'_1 f_1 + ··· + a'_s f_s + h',

where u, h', a'_1, ..., a'_s are as in the theorem. Since the leading term
of a unit is a nonzero constant (see Exercise 2), dividing a polynomial by a
unit doesn’t affect the leading term (up to multiplication by a nonzero
constant). Thus, dividing the above equation by the unit u u' gives

    f = a_1 f_1 + ··· + a_s f_s + h,

where a_i = a'_i/(u u'), h = h'/(u u') clearly have the required properties. □

In the next section, we will use the Mora normal form algorithm to extend
Buchberger’s algorithm for Gröbner bases to ideals in local rings.



Additional Exercises for §3 

Exercise 8. Let R = k[x_1, ..., x_n]_(x_1,...,x_n), S = k[[x_1, ..., x_n]],
and let > be a local order on monomials in R and S. Let i : R → S denote
the natural inclusion obtained by writing each element h in R in the form
f/(1 + g), then expanding 1/(1 + g) in a formal geometric series.

a. Show that multideg(h) = multideg(i(h)).

b. Deduce that

    LM_>(h) = LM_>(i(h)), LC_>(h) = LC_>(i(h)), and LT_>(h) = LT_>(i(h)).

Exercise 9. In the Mora normal form algorithm (3.10), if, after
dehomogenizing, h = 0, show that f belongs to the ideal generated by
f_1, ..., f_s in the ring R = Loc_>(k[x_1, ..., x_n]). Is the converse always
true?

Exercise 10. How should the Mora normal form algorithm (3.10) be
extended to return the quotients A_i and the unit U as well as the polynomial
H? Hint: Use the proof of correctness.

Exercise 11. This exercise describes the way Mora based the original
version of the normal form algorithm (for local orders) on the écart of a
polynomial. Let g ≠ 0 ∈ k[x_1, ..., x_n], and write g as a finite sum of
homogeneous, nonzero polynomials of distinct total degrees: g = Σ_{i=1}^k g_i,
g_i ≠ 0 homogeneous, with deg(g_1) < ··· < deg(g_k). The order of g,
denoted ord(g), is the total degree of g_1. The total degree of g, denoted
deg(g), is the total degree of g_k. The écart of g, denoted E(g), is the
difference of the degree of g and the order of g:

    E(g) = deg(g) - ord(g).

By convention, we set E(0) = -1. Thus E(g) ≥ -1 for all g. (The word
écart is French for “difference” or “separation”, clearly a good description
of the meaning of E(g)!)

a. Let > be a local order and let f and g be two nonzero polynomials such
that LT(g) divides LT(f). Then show that

    E(Red(f, g)) ≤ max(E(f), E(g)).

b. In the one variable case, part a gives a strategy that guarantees
termination of division. Namely, at each stage, among all the polynomials by
which we can reduce, we reduce by the polynomial whose écart is least.
Show that this will ensure that the écarts of the sequence of partial
dividends decrease to zero, at which point we have a monomial which
can be used to reduce any subsequent partial dividend to 0.
c. Apply this strategy, reducing by the polynomial with the smallest
possible écart at each step, to show that g divides f in k[x]_(x) in each of
the following cases.

1. g = x + x^2 + x^3, f = x^2 + 2x^7. Note that there is no way to produce
a sequence of partial dividends with strictly decreasing écarts in this
case.

2. g = x + x^2 + x^3, f = x + x^2 + x^3 + x^4. Note that after producing
a monomial with the first reduction, the écart must increase.

Exercise 12. Let > be a semigroup order on monomials in k[x_1, ..., x_n]
and extend to >' on monomials in t, x_1, ..., x_n as in the text: define
t^a x^α >' t^b x^β if either a + |α| > b + |β|, or a + |α| = b + |β| but
x^α > x^β.

a. Show that >' is actually a monomial order on k[t, x_1, ..., x_n].

b. Show that if > = >_U for an m × n matrix U, then >' is the order >_{U'}
where U' is the (m + 1) × (n + 1) matrix

          ( 1  1  ...  1 )
          ( 0            )
    U' =  ( :      U     )
          ( 0            )

Exercise 13. Show that to prove correctness of the Mora normal form
algorithm given in Theorem (3.10), it suffices to show that there are
homogeneous polynomials U, A_1, ..., A_s such that

    U F = A_1 F_1 + ··· + A_s F_s + H,

where LT(U) = t^a for some a,

    a + deg(F) = deg(A_i) + deg(F_i) = deg(H)

whenever A_i, H ≠ 0, and no LT(F_i) divides t^b LT(H) for any b ≥ 0.

Exercise 14. Prove Corollary (3.12) using the homogeneous polynomials 
produced by the Mora algorithm, as in the proof of Theorem (3.10). 

§4 Standard Bases in Local Rings 

In this section, we want to develop analogs of Gröbner bases for ideals in
any one of our local rings R = k[x_1, ..., x_n]_(x_1,...,x_n),
R = k{x_1, ..., x_n}, or R = k[[x_1, ..., x_n]]. Just as for well-orderings,
given an ideal I in R, we define the set of leading terms of I, denoted
LT(I), to be the set of all leading terms of elements of I with respect to >.
Also, we define the ideal of leading terms of I, denoted ⟨LT(I)⟩, to be the
ideal generated by the set LT(I) in R. Also just as for ideals in polynomial
rings, it can happen that I = ⟨f_1, ..., f_s⟩ but
⟨LT(I)⟩ ≠ ⟨LT(f_1), ..., LT(f_s)⟩ for an ideal I ⊂ R.
By analogy with the notion of a Gröbner basis, we make the following
definition.



(4.1) Definition. Let > be a semigroup order and let R be the ring of
fractions Loc_>(k[x_1, ..., x_n]) as in Definition (3.5), or
R = k[[x_1, ..., x_n]] or k{x_1, ..., x_n}. Let I ⊂ R be an ideal. A standard
basis of I is a set {g_1, ..., g_t} ⊂ I such that
⟨LT(I)⟩ = ⟨LT(g_1), ..., LT(g_t)⟩.



In the literature, the term “standard basis” is more common than
“Gröbner basis” when working with local orders and the local rings
R = k[x_1, ..., x_n]_(x_1,...,x_n), k[[x_1, ..., x_n]], or k{x_1, ..., x_n},
so we use that terminology here.

Every nonzero ideal in these local rings has standard bases. As a result, 
there is an analog of the Hilbert Basis Theorem for these rings: every ideal 
has a finite generating set. The proof is the same as for polynomials (see 
Exercise 2 of Chapter 1, §3 and Exercise 4 below). Moreover, the Mora 
normal form algorithm — Theorem (3.10) — is especially well-behaved when 
dividing by a standard basis. In particular, we obtain a zero remainder if 
and only if / is in the ideal generated by the standard basis. See part a of 
Exercise 4. 

However, in order to construct algorithms for computing standard bases, 
we will restrict our attention once more to ideals that are generated in 
these rings by collections of polynomials. Most of the ideals of interest in 
questions from algebraic geometry have this form. 

Given polynomial generators for an ideal, how can we compute a standard
basis for the ideal? For the polynomial ring k[x_1, ..., x_n] and Gröbner
bases, the key elements were the division algorithm and Buchberger’s
algorithm. Since we have the Mora algorithm, we now need to see if we can
carry Buchberger’s algorithm over to the case of local or other semigroup
orders. That is, given a collection f_1, ..., f_s of polynomials, we would
like to find a standard basis with respect to some local order of the ideal
⟨f_1, ..., f_s⟩ they generate in a local ring R. More generally, one could
also look for algorithms for computing standard bases of ideals in
Loc_>(k[x_1, ..., x_n]) for any semigroup order.

It is a rather pleasant surprise that the ingredients fall into place with
no difficulty. First, the definition of S-polynomials in this new setting is
exactly the same as in k[x_1, ..., x_n] (see Definition (3.2) of Chapter 1),
but here we use the leading terms with respect to our chosen semigroup
order.

Next, recall that Buchberger’s algorithm consists essentially in forming
S-polynomials of all elements in the input set F = {f_1, ..., f_s} of
polynomials, finding remainders upon division by F, adding to F any nonzero
remainders, and iterating this process (see §3 of Chapter 1). Since we have
the Mora normal form algorithm, whose output is a sort of remainder on
division, we can certainly carry out the same steps as in Buchberger’s
algorithm. As with any algorithm, though, we have to establish its
correctness (that is, that it gives us what we want) and that it terminates.

In the case of well-orders, Buchberger’s criterion (the statement that a
finite set G is a Gröbner basis if and only if the remainder upon division
by G of every S-polynomial formed from pairs of elements of G is 0; see
Chapter 1, §3) guarantees correctness. The first part of the following
theorem gives an analog.

(4.2) Theorem. Let S be a finite set of polynomials, let > be any
semigroup order, and let I be the ideal in R = Loc_>(k[x_1, ..., x_n])
generated by S.

a. (Analog of Buchberger’s Criterion) S = {g_1, ..., g_t} is a standard basis
for I if and only if applying the Mora normal form algorithm given in
Theorem (3.10) to every S-polynomial formed from elements of the set
of homogenizations S^h = {g_1^h, ..., g_t^h} yields a zero remainder.

b. (Analog of Buchberger’s Algorithm) Homogenization, followed by the
version of Buchberger’s algorithm using the Mora normal form algorithm
in place of remainder on ordinary polynomial division, followed
by dehomogenization, computes a polynomial standard basis for the ideal
generated by S, and terminates after finitely many steps.

Proof. For notational simplicity, we will write G_i = g_i^h. For part a, we
will write \overline{F}^{S^h,Mora} for the remainder H computed by Theorem
(3.10) on division of the homogeneous polynomial F by S^h. If S is a
standard basis of I, then since S(g_i, g_j) ∈ I for all i, j, the result of
Exercise 4 implies that \overline{S(G_i, G_j)}^{S^h,Mora} = 0 for all i, j.

Conversely, we need to show that \overline{S(G_i, G_j)}^{S^h,Mora} = 0 for
all i, j implies that S is a standard basis, or equivalently that
⟨LT(I)⟩ = ⟨LT(g_1), ..., LT(g_t)⟩, using the order >. As in the proof of the
usual version of Buchberger’s criterion (see, for example, Theorem 6 of
Chapter 2, §6 of [CLO]) we must consider expressions for elements in ⟨S^h⟩
in which cancellation of leading terms may have occurred, producing a
leading term that is not a multiple of any of the LT(G_i). (The leading
terms of all homogeneous polynomials are computed using the >' order from
Definition (3.6).) Say

(4.3)   F = Σ_i A_i G_i

is such a combination, in which LT(A_i G_i) > LT(F) for some i. Since the
G_i are homogeneous, we can without loss of generality restrict to
considering equations (4.3) where the A_i and F are also homogeneous and the
left and right-hand sides have the same total degree. Then the proof of
Theorem 6 of Chapter 2, §6 of [CLO] applies in this case too. If
LT(A_i)LT(G_i) > LT(F) for some i, then we can reduce the leading term on
the right-hand side in (4.3) by substituting from the expressions for the
S-polynomials produced by Mora’s algorithm. But there are only a finite
number of monomials of each fixed total degree, so the reduction process
eventually terminates with an expression (4.3) where LT(F) is a multiple of
one of the LT(G_i). Dehomogenizing yields the desired result.

The usual proof that Buchberger’s algorithm terminates and yields a
Gröbner basis depends only on the ascending chain condition for
polynomial ideals (applied to the chain of monomial ideals generated by
the leading terms of the “partial bases” constructed as the algorithm
proceeds; see the proof of Theorem 2 of [CLO], Chapter 2, §2). It does
not require that the order used for the division process be a well-order. It
follows that, replacing each ordinary remainder computation by a
computation of the remainder from Mora’s algorithm, we get an algorithm that
terminates after a finite number of steps. Moreover, on termination,
dehomogenizing the results gives a standard basis for I because the
criterion from part a is satisfied. □

The Mora normal form algorithm and standard basis algorithms using
local orders or more general semigroup orders > are not implemented directly
in the Gröbner basis packages in Maple or Mathematica. They could be
programmed directly in those systems, however, using the homogenization
process and the order >' from Definition (3.6). Alternatively, according to
Lazard’s original idea, the standard Buchberger algorithm could be applied
to the homogenizations of a generating set for I. This approach is sketched
in Exercise 7 below and can be carried out in any Gröbner basis
implementation. Experience seems to indicate that standard basis computation
with Mora’s normal form algorithm is more efficient than computation using
Lazard’s approach, however. The CALI package for REDUCE does contain
an implementation of Buchberger’s algorithm using semigroup orders
including local orders. There is also a powerful package called Singular
available via the World Wide Web from the University of Kaiserslautern
(see the Singular homepage at the URL
http://www.mathematik.uni-kl.de/zca/Singular/) that carries out these and
many other calculations. In particular, Singular is set up so that local
orders, monomial orders (well-orderings), and mixed orders can be specified
in a unified way as >_U orders for integer matrices U. This means that it
can be used for both Gröbner and standard basis computations. Here is a very
simple Singular session computing a standard basis of the ideal generated by

    x^5 - xy^6 + z^7,   xy + y^3 + z^3,   x^2 + y^2 - z^2

in R = k[x, y, z]_(x,y,z) using the alex order, and computing the
multiplicity of the origin as a solution of the corresponding system of
equations.

> ring r = 32003, (x,y,z), Ds;
> ideal i = x5-xy6+z7, xy+y3+z3, x2+y2-z2;
> ideal j=std(i);
4(2)s5.8-s(2)s9..s(3).10.sH(11)
product criterion:8 chain criterion:7
> j;
j[1]=x2+y2-1z2
j[2]=xy+y3+z3
j[3]=y3-1yz2-1xy3-1xz3
j[4]=xz4-1y6+2y4z2-1y3z3+2yz5-1xy6+z7
j[5]=y2z4-1z6+xy6-2xy4z2+xy3z3-2xyz5+x2y6-1xz7
j[6]=yz7
j[7]=z9
> vdim(j);
24

Singular can work either with a finite field of coefficients or with k = ℚ
or a finite extension of ℚ. The first line here defines the characteristic
of the field, the ring variables, and the monomial order. The Ds is an
abbreviation for the alex order, which could also be specified by a matrix
as follows

> ring r = 32003, (x,y,z), ((-1,-1,-1),(0,-1,-1),(0,0,-1));

as in Exercise 2 of §3. The ideal I is defined by the three polynomials
above, J contains the standard basis (7 polynomials in all), and the vdim
command computes the dimension dim R/⟨LT(J)⟩. For more information about
this very flexible package, we refer the interested reader to the
documentation available with the program.
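
The value 24 reported by vdim can be cross-checked by hand, since
dim R/⟨LT(J)⟩ is the number of monomials not divisible by any leading term
of the standard basis. The Python sketch below does this count; it is an
independent illustration rather than part of the Singular session, and the
list of leading monomials is read off from the output above under the
assumption that the first term Singular prints for each j[i] is its leading
term with respect to the alex order.

```python
from itertools import product

# exponents (of x, y, z) of the leading monomials
# x^2, xy, y^3, xz^4, y^2z^4, yz^7, z^9 of j[1], ..., j[7]
leading = [(2, 0, 0), (1, 1, 0), (0, 3, 0), (1, 0, 4),
           (0, 2, 4), (0, 1, 7), (0, 0, 9)]

def divides(m, n):
    """True if the monomial with exponents m divides the one with exponents n."""
    return all(a <= b for a, b in zip(m, n))

# x^2, y^3 and z^9 are leading monomials, so every monomial outside the
# ideal has all exponents below 9 and a finite search suffices
standard = [m for m in product(range(9), repeat=3)
            if not any(divides(l, m) for l in leading)]

print(len(standard))   # 24
```

The 24 monomials found this way are 1, z, ..., z^8, then y, yz, ..., yz^6,
then y^2, ..., y^2z^3, and finally x, xz, xz^2, xz^3.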

We will consider some applications of these methods in the remainder 
of this section. The multiplicity, and Milnor and Tjurina number compu- 
tations we introduced in §2 can be carried out in an algorithmic fashion 
using standard bases. The reason is the following fact. 

(4.4) Proposition. Let I be an ideal in a local ring R such that for some
local order >, dim R/⟨LT(I)⟩ is finite. Then every f ∈ R can be written
uniquely in the form

    f = g + r

where g ∈ I and r is a polynomial such that no term of r lies in ⟨LT(I)⟩.
Furthermore, r is unique, and can be computed algorithmically when
R = k[x_1, ..., x_n]_(x_1,...,x_n).

Proof. First suppose that R = k[x_1, ..., x_n]_(x_1,...,x_n). Then Exercise 11
of §1 implies that I is generated by polynomials, which means that we can
compute a polynomial standard basis G of I. If we divide f by G using
Corollary (3.12), we get

    f = g_1 + h_1

where g_1 ∈ I, LT(h_1) ∉ ⟨LT(G)⟩ = ⟨LT(I)⟩ (since G is a standard basis)
and LT(f) ≥ LT(h_1).

We can write h_1 = c_1 x^{α(1)} + r_1 where x^{α(1)} > LT(r_1) and c_1 ≠ 0.
Applying the above process to r_1 gives

    r_1 = g_2 + h_2 = g_2 + c_2 x^{α(2)} + r_2

with g_2 ∈ I, x^{α(2)} ∉ ⟨LT(I)⟩ and x^{α(2)} > LT(r_2). If we combine this
with the formula for f, we obtain

    f = g_1 + h_1 = g_1 + c_1 x^{α(1)} + r_1
      = (g_1 + g_2) + c_1 x^{α(1)} + c_2 x^{α(2)} + r_2,

where g_1 + g_2 ∈ I, x^{α(1)}, x^{α(2)} ∉ ⟨LT(I)⟩ and
x^{α(1)} > x^{α(2)} > LT(r_2).

We can continue this process as long as we have nonzero terms to work
with. However, since there are only finitely many monomials not in ⟨LT(I)⟩
(because of our hypothesis that dim R/⟨LT(I)⟩ is finite), this process must
eventually terminate, which shows that f has the desired form. We will
leave it for the reader to describe an algorithm which carries out this
process (see Exercise 8 below).

When R = k{x_1, ..., x_n} or R = k[[x_1, ..., x_n]], if we assume that we
can perform the Mora Normal Form Algorithm on inputs from R, then the
above argument applies for any f ∈ R. The details of how this works will
be discussed in Exercise 4 below.

Once we have f = g + r, where g and r have the desired properties, the
uniqueness of r follows easily (the proof is similar to the case for Gröbner
bases; see Proposition 1 of Chapter 2, §6 of [CLO]). □



When R = k[[x_1, ..., x_n]] or R = k{x_1, ..., x_n}, there are more
powerful versions of Proposition (4.4) which don’t assume that
dim R/⟨LT(I)⟩ is finite. In these situations, the remainder r is an infinite
series, none of whose terms are in ⟨LT(I)⟩. See, for example, [Hir] or
[MPT]. However, for R = k[x_1, ..., x_n]_(x_1,...,x_n), it is possible to
find ideals I ⊂ R where nice remainders don’t exist (see [AMR], Example 2).

A nice application of Proposition (4.4) is the following corollary which 
tells us how to compute the dimension of a quotient ring. 

(4.5) Corollary. Let I be an ideal in a local ring R, and assume that
dim R/⟨LT(I)⟩ is finite for some local order >. Then we have

    dim R/I = dim R/⟨LT(I)⟩.

Proof. Given f ∈ R, we write f = g + r as in Proposition (4.4). It
is easy to check that the map sending the coset [f] ∈ R/I to the coset
[r] ∈ R/⟨LT(I)⟩ is well-defined, linear and one-to-one. It is also onto since
the cosets of the monomials not in ⟨LT(I)⟩ form a basis of R/⟨LT(I)⟩. See
Exercise 9 for the details. □

We can use Corollary (4.5) to prove Proposition (2.11), which asserts
that

    dim k[x_1, ..., x_n]_(x_1,...,x_n) / I k[x_1, ..., x_n]_(x_1,...,x_n)
        = dim k[[x_1, ..., x_n]] / I k[[x_1, ..., x_n]]
        = dim k{x_1, ..., x_n} / I k{x_1, ..., x_n}

where the last equality assumes k = ℝ or ℂ. This is because a standard
basis for I ⊂ k[x_1, ..., x_n]_(x_1,...,x_n) is also a standard basis for
I k[[x_1, ..., x_n]] and I k{x_1, ..., x_n} by Buchberger’s criterion. Thus,
for a fixed local order, the leading terms are the same no matter which of
the local rings R we are considering. It follows that the quotient
R/⟨LT(I)⟩ doesn’t depend on which R we are using, and then the equalities
above follow immediately from Corollary (4.5). Corollary (4.5) also gives a
good method for computing multiplicities and Milnor and Tjurina numbers as
defined in §2.

Standard bases in local rings have other geometric applications as well.
For instance, suppose that V ⊂ k^n is a variety and that p = (a_1, ..., a_n)
is a point of V. Then the tangent cone to V at p, denoted C_p(V), is defined
to be the variety

    C_p(V) = V(f_{p,min} : f ∈ I(V)),

where f_{p,min} is the homogeneous component of lowest degree in the
polynomial f(x_1 + a_1, ..., x_n + a_n) obtained by translating p to the
origin (see part b of Exercise 17 of §2). A careful discussion of tangent
cones, including a Gröbner basis method for computing them, can be found in
Chapter 9, §7 of [CLO]. However, standard bases give a more direct way to
compute tangent cones than the Gröbner basis method. See Exercise 13 below
for an outline of the main ideas.

Here is another sort of application, where localization is used to con- 
centrate attention on one irreducible component of a reducible variety. To 
illustrate the idea, we will use an example from Chapter 6, §4 of [CLO]. In 
that section, we showed that the hypotheses and the conclusions of a large 
class of theorems in Euclidean plane geometry can be expressed as polyno- 
mial equations on the coordinates of points specified in the construction of 
the geometric figures involved in their statements. For instance, consider 
the theorem which states that the diagonals of a parallelogram ABCD in 
the plane intersect at a point that bisects both diagonals (Example 1 of 
[CLO], Chapter 6, §4). We place the vertices A, B, C, D of the parallelogram
as follows:

    A = (0, 0), B = (u, 0), C = (v, w), D = (a, b),

and write the intersection point of the diagonals AD and BC as N = (c, d).
We think of the coordinates u, v, w as arbitrary; their values determine the
values of a, b, c, d. The conditions that ABCD is a parallelogram and N is
the intersection of the diagonals can be written as the following polynomial
equations:

    h_1 = b - w = 0
    h_2 = (a - u)w - bv = 0
    h_3 = ad - cw = 0
    h_4 = d(v - u) - (c - u)w = 0,

as can the conclusions of the theorem (the equalities between the lengths
AN = DN and BN = CN)

    g_1 = a^2 - 2ac - 2bd + b^2 = 0
    g_2 = 2cu - 2cv - 2dw - u^2 + v^2 + w^2 = 0.

Since the geometric theorem is true, we might naively expect that the
conclusions $g_1 = g_2 = 0$ are satisfied whenever the hypothesis equations
$h_1 = h_2 = h_3 = h_4 = 0$ are satisfied. If we work over the algebraically
closed field $\mathbb{C}$, then the Strong Nullstellensatz shows that our naive hope
is equivalent to

$$g_i \in \mathbf{I}(\mathbf{V}(h_1, h_2, h_3, h_4)) = \sqrt{\langle h_1, h_2, h_3, h_4\rangle}.$$

However, as the following exercise illustrates, this is unfortunately not true. 

Exercise 1. Use the radical membership test from [CLO], Chapter 4, §2
to show that

$$g_1, g_2 \notin \sqrt{\langle h_1, h_2, h_3, h_4\rangle} \subset \mathbb{C}[u,v,w,a,b,c,d].$$

Thus neither conclusion $g_1$, $g_2$ follows directly from the hypothesis
equations.
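One way to see why the radical membership must fail, without running the test itself, is to exhibit a point of the hypothesis variety at which the conclusions do not vanish. The following is a minimal sketch in Python rather than in Singular; the function names and the two sample points are our own choices for illustration and do not come from [CLO].

```python
from fractions import Fraction

def hypotheses(u, v, w, a, b, c, d):
    """Values of h1, h2, h3, h4 at the point (u, v, w, a, b, c, d)."""
    return (b - w,
            (a - u) * w - b * v,
            a * d - c * w,
            d * (v - u) - (c - u) * w)

def conclusions(u, v, w, a, b, c, d):
    """Values of g1, g2 at the point (u, v, w, a, b, c, d)."""
    g1 = a**2 - 2*a*c - 2*b*d + b**2
    g2 = 2*c*u - 2*c*v - 2*d*w - u**2 + v**2 + w**2
    return (g1, g2)

# Honest parallelogram with (u, v, w) = (1, 1, 1): D = (2, 1), N = (1, 1/2).
honest = (1, 1, 1, 2, 1, 1, Fraction(1, 2))

# Degenerate configuration with w = 0: all four vertices lie on a line
# and B = D (an assumed sample point, picked by hand).
degenerate = (1, 2, 0, 1, 0, 0, 0)

print(hypotheses(*honest), conclusions(*honest))
print(hypotheses(*degenerate), conclusions(*degenerate))
```

At the degenerate point all four hypotheses vanish while $g_1$ and $g_2$ do not, so neither $g_i$ lies in $\mathbf{I}(\mathbf{V}(h_1, h_2, h_3, h_4))$. This does not replace the radical membership computation asked for in the exercise, but it explains its outcome.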

In fact, in [CLO], Chapter 6, §4 we saw that the reason for this was
that the variety $H = \mathbf{V}(h_1, h_2, h_3, h_4) \subset \mathbb{C}^7$ defined by the hypotheses is
actually reducible, and the conclusion equations $g_i = 0$ are not identically
satisfied on several of the irreducible components of H. The points on the
“bad” components correspond to degenerate special cases of the configuration
A, B, C, D, N such as “parallelograms” in which two of the vertices
A, B, C, D coincide. In [CLO], Chapter 6, §4 we analyzed this situation very
carefully and found the “good” component of H, on which the conclusions
$g_1 = g_2 = 0$ do hold. Our purpose here is to point out that what we did
in [CLO] can also be accomplished more easily by localizing appropriately.

Note that taking $(u,v,w) = (1,1,1)$ gives an “honest” parallelogram.
If we now translate (1, 1, 1) to the origin as in Exercise 17 of §2, and
write the translated coordinates as $(U, V, W, a, b, c, d)$, the hypotheses and
conclusions become

$$\begin{aligned}
h_1 &= b - W - 1 = 0\\
h_2 &= (a - U - 1)(W + 1) - b(V + 1) = 0\\
h_3 &= ad - c(W + 1) = 0\\
h_4 &= d(V - U) - (c - U - 1)(W + 1) = 0\\
g_1 &= a^2 - 2ac - 2bd + b^2 = 0\\
g_2 &= 2c(U + 1) - 2c(V + 1) - 2d(W + 1) - (U + 1)^2 + (V + 1)^2 + (W + 1)^2 = 0.
\end{aligned}$$

We compute a standard basis for the ideal generated by the $h_i$ in the
localization $R = \mathbb{Q}[U, V, W]_{\langle U,V,W\rangle}[a, b, c, d]$. Here is a Singular session
carrying out the computations.

> ring r = 0, (a,b,c,d,U,V,W), (Dp(4),Ds(3));
> ideal i = b-W-1, (a-U-1)*(W+1)-b*(V+1), ad-c*(W+1), d*(V-U)-(c-U-1)*(W+1);
> ideal j = std(i);
> j;
j[1]=a+aW-1b-1bV-1-1U-1W-1UW
j[2]=b-1-1W
j[3]=c+cW+dU-1dV-1-1U-1W-1UW
j[4]=2d+2dU+2dW+2dUW-1-1U-2W-2UW-1W2-1UW2

The first line sets up the ring R by specifying the coefficient field $k = \mathbb{Q}$
and a mixed order on the variables as in Exercise 3 of §3 of this chapter, with
alex on the variables U, V, W, ordinary lex on a, b, c, d, and all monomials
containing a, b, c, d greater than any monomial in U, V, W alone. If we now
apply the Mora algorithm, which is provided in the Singular command
reduce, we find that both conclusions are actually in the ideal generated
by $h_1, h_2, h_3, h_4$ in R. (The reduce command takes inhomogeneous inputs
and returns the dehomogenization of the remainder from Theorem (3.10).)

> poly g=a2-2ac-2bd+b2;
> poly h=reduce(g,j);
> h;
0
> poly m = 2c*(U+1)-2c*(V+1)-2d*(W+1)-(U+1)^2+(V+1)^2+(W+1)^2;
> poly n = reduce(m,j);
> n;
0

This shows that locally near the point with $(u, v, w) = (1, 1, 1)$ on the
variety $\mathbf{V}(h_1, h_2, h_3, h_4)$, the conclusions do follow from the hypotheses.
Using the mixed order in the Mora algorithm, we have an equation

$$u \cdot g_1 = a_1 h_1 + \cdots + a_4 h_4$$

where $u \in \mathbb{Q}[U, V, W]$ is a unit in $\mathbb{Q}[U, V, W]_{\langle U,V,W\rangle}$, and a similar equation
for $g_2$. In particular, this shows that Proposition 8 of Chapter 6, §4 of [CLO]
applies and the conclusions $g_1, g_2$ follow generically from the hypotheses
$h_i$, as defined there.

Along the same lines we have the following general statement, showing
that localizing at a point p in a variety V implies that we ignore components
of V that do not contain p.

(4.6) Proposition. Let $I \subset k[x_1, \dots, x_n]$ and suppose that the origin in
$k^n$ is contained in an irreducible component W of $\mathbf{V}(I)$. Let $f_1, \dots, f_s \in
k[x_1, \dots, x_n]$ be a standard basis for I with respect to a local order, and let
$g \in k[x_1, \dots, x_n]$. If the remainder of g on division by $F = (f_1, \dots, f_s)$
using the Mora algorithm from Theorem (3.10) is zero, then $g \in \mathbf{I}(W)$ (but
not necessarily in I).

Proof. If the remainder is zero, after dehomogenizing, the Mora algorithm
yields an equation

$$u \cdot g = a_1 f_1 + \cdots + a_s f_s$$

where $u \in k[x_1, \dots, x_n]$ is a unit in $k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$. Since $W \subset
\mathbf{V}(I)$, $u \cdot g$ is an element of $\mathbf{I}(W)$. But W is irreducible, so $\mathbf{I}(W)$ is a prime
ideal, and hence $u \in \mathbf{I}(W)$ or $g \in \mathbf{I}(W)$. The first alternative is not possible
since $u(0) \ne 0$. Hence $g \in \mathbf{I}(W)$. □
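As a small illustration of Proposition (4.6), chosen by us and not taken from the text, let $I = \langle xy - y\rangle \subset k[x, y]$. Then $\mathbf{V}(I) = \mathbf{V}(y) \cup \mathbf{V}(x - 1)$, and only the component $W = \mathbf{V}(y)$ contains the origin. The single generator $f_1 = xy - y$ is a standard basis for any local order, and for $g = y$ one equation of the form produced by the Mora algorithm is

$$(1 - x) \cdot y = (-1) \cdot (xy - y),$$

with the unit $u = 1 - x$ and zero remainder. Consequently $y \in \mathbf{I}(W)$, as the proposition predicts, although $y \notin I$ because y does not vanish at the point $(1, 1) \in \mathbf{V}(I)$.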

It is natural to ask if we can carry out operations on ideals in local rings
algorithmically in ways similar to the Grobner basis methods reviewed
in Chapter 1 for ideals in polynomial rings. In the final part of this section,
we will show that the answer is yes when $R = k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$.

Since many of the proofs in the polynomial case use elimination, we first
need to study elimination in the local context. The essential point will
be to work in the new ring $k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}[t]$, whose elements can be
thought of first as polynomials in t whose coefficients are elements of
$k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$.

In this situation, if we have an ideal $I \subset k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}[t]$, the
basic problem is to find the intersection

$$I_0 = I \cap k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}.$$

Note that $I_0$ is analogous to an elimination ideal of a polynomial ideal. This
elimination problem can be solved using a local order > on the local ring
to construct a suitable semigroup order on $S = k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}[t]$ as
follows (see [AMR] and [Gra] for the details).

(4.7) Definition. An elimination order on S is any semigroup order $>_{elim}$
on the monomials on S defined in the following way. Let > be a local order
in $k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$. Then define

$$t^k x^\alpha >_{elim} t^l x^\beta$$

for $k, l \in \mathbb{Z}_{\ge 0}$, and $\alpha, \beta \in \mathbb{Z}^n_{\ge 0}$ if and only if $k > l$, or $k = l$ and
$\alpha > \beta$. In other words, an elimination order is a product order combining
the degree order on powers of t and the given local order > on
$k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$.

Elimination orders on S are neither local nor well-orders. Hence, the 
full strength of the Mora algorithm for general semigroup orders is needed 
here. We have the following analog of the Elimination Theorem stated in 
Chapter 2, §1.
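The comparison rule in Definition (4.7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: monomials $t^k x^\alpha$ are represented by an integer k and an exponent tuple, and the local order is taken to be the antigraded lex order as we understand it (lower total degree is larger, ties broken by lex with $x_1 > x_2 > \cdots$). The function names are ours.

```python
def alex_greater(alpha, beta):
    """x^alpha > x^beta in the antigraded lex order (lower total degree wins)."""
    if sum(alpha) != sum(beta):
        return sum(alpha) < sum(beta)
    # Ties are broken by ordinary lex with x1 > x2 > ...
    return tuple(alpha) > tuple(beta)

def elim_greater(k, alpha, l, beta, local_greater=alex_greater):
    """t^k x^alpha >_elim t^l x^beta: compare powers of t, then the local order."""
    if k != l:
        return k > l
    return local_greater(alpha, beta)

one = (0, 0)

# t > 1, so the elimination order is not a local order.
t_beats_one = elim_greater(1, one, 0, one)

# 1 > x1 > x1^2 > ... is an infinite descending chain, so it is not a well-order.
chain = [elim_greater(0, (i, 0), 0, (i + 1, 0)) for i in range(5)]
```

Both checks come out true, which matches the remark that elimination orders on S are neither local nor well-orders.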

(4.8) Theorem (Local Elimination). Fix an elimination order $>_{elim}$
on $S = k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}[t]$. Let $I \subset S$ be an ideal, and let G be a
polynomial standard basis for I with respect to $>_{elim}$. Then $G \cap k[x_1, \dots, x_n]$
is a standard basis for $I_0 = I \cap k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$. In fact,

$$\{g \in G : \mathrm{LT}(g) \text{ does not contain } t\}$$

is a standard basis of $I_0$.

Proof. Let $G = \{g_1, \dots, g_t\}$ be a standard basis of I and $G_0 = \{g \in G :
\mathrm{LT}(g) \text{ does not contain } t\}$. By the definition of $>_{elim}$, the condition that
$\mathrm{LT}(g)$ does not contain t implies that g does not contain t. Since $G_0 \subset I_0$,
we need only show that if $f \in I_0 \cap k[x_1, \dots, x_n]$ then f can be written as
a combination of elements in $G_0$ with coefficients in $k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$.
Exercise 4 below shows that since $f \in I$ and $\{g_1, \dots, g_t\}$ is a standard
basis of I, the Mora algorithm gives (after dehomogenizing)

$$f = a_1 g_1 + \cdots + a_t g_t,$$

where $\mathrm{LT}(f) \ge \mathrm{LT}(a_i g_i)$ for all $a_i \ne 0$. By our choice of order, we have
$a_i = 0$ for $g_i \notin G_0$ and $g_i \in k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$ otherwise, since t
does not appear in $\mathrm{LT}(f)$. □

With this out of the way, we can immediately prove the following: 

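(4.9) Theorem. Let $I, J \subset k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$ and $f \in k[x_1, \dots, x_n]$.

a. $I \cap J = (t \cdot I + (1 - t) \cdot J) \cap k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$.

b. $I : \langle f\rangle = \frac{1}{f} \cdot (I \cap \langle f\rangle)$.

c. $I : f^\infty = (I + \langle 1 - f \cdot t\rangle) \cap k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$.

d. $f \in \sqrt{I}$ if and only if $1 \in I + \langle 1 - f \cdot t\rangle$ in $k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}[t]$.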

Proof. The proofs are exactly the same as in the case of polynomial
ideals. (See Chapter 1 of this book, §2 and §3 of Chapter 4 of [CLO], and
[AL] or [BW].) We remind the reader that if I is an ideal in any ring R, and
$f \in R$, then the stable quotient of I with respect to f, denoted $I : f^\infty$, is
defined to be the set

$$I : f^\infty = \{g \in R : \text{there exists } n \ge 1 \text{ for which } f^n g \in I\}.$$

The stable quotient is frequently useful in applications of local algebra. We
also remark that the division in part b, where one divides the common
factor f out from all generators of $I \cap \langle f\rangle$ in $k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$, uses
the Local Division Algorithm. □
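For instance (an example of our own, not from the text), take $I = \langle xy\rangle$ and $f = x$ in $k[x, y]_{\langle x,y\rangle}$. Since $x \cdot y \in I$, we have $y \in I : x^\infty$, and the identity

$$y = y\,(1 - xt) + t\,(xy)$$

exhibits y as an element of $(I + \langle 1 - xt\rangle) \cap k[x, y]_{\langle x,y\rangle}$, in agreement with part c of Theorem (4.9).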

Just as the ability to do computations in polynomial rings extends to
allow one to do computations in quotients (i.e., homomorphic images of
polynomial rings), so, too, the ability to do computations in local rings
extends to allow one to do computations in quotients of local rings. Suppose
that $J \subset k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$ and let $R = k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}/J$.
Then one can do computations algorithmically in R due to the following
elementary proposition.

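(4.10) Proposition. Let $\mathcal{I}_1, \mathcal{I}_2 \subset R$ be ideals, and let $I_1, I_2$ denote their
preimages in $k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$. Let $f \in k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$ and
$[f] \in R$ be its coset. Then:

a. $\mathcal{I}_1 \cap \mathcal{I}_2 = ((I_1 + J) \cap (I_2 + J))/J$.

b. $\mathcal{I}_1 : [f] = ((I_1 + J) : f)/J$.

c. $\mathcal{I}_1 : [f]^\infty = ((I_1 + J) : f^\infty)/J$.

Using a standard basis of J allows one to determine whether
$f, g \in k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$ represent the same element in R (that is,
whether $[f] = [g]$). One can also compute Hilbert functions and syzygies over R.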

The techniques we have outlined above also extend to rings which
are finite algebraic extensions of $k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$ in $k[[x_1, \dots, x_n]]$.
This allows us to handle computations involving algebraic power series in
$k[[x_1, \dots, x_n]]$ algorithmically. See [AMR] for details. There are still many
open questions in this area, however. Basically, one would hope to handle 
any operations on ideals whose generators are defined in some suitably al- 
gebraic fashion (not just ideals generated by polynomials), but there are 
many instances where no algorithms are known. 



Additional Exercises for §4 

Exercise 3. In this exercise and the next, we will show that every ideal
I in one of our local rings R has standard bases, and derive consequences
about the structure of R. Let > be any local order on R.
a. Explain why $\langle \mathrm{LT}(I)\rangle$ has a finite set of generators.
b. For each $x^{\alpha(i)}$, $i = 1, \dots, t$, in a finite set of generators of $\langle \mathrm{LT}(I)\rangle$, let
$g_i \in I$ be an element with $\mathrm{LT}(g_i) = x^{\alpha(i)}$. Deduce that $G = \{g_1, \dots, g_t\}$
is a standard basis for I.

Exercise 4. If we ignore the fact that infinitely many computational steps
are needed to perform reductions on power series in $k[[x_1, \dots, x_n]]$ or
$k\{x_1, \dots, x_n\}$, then the steps of the Mora normal form algorithm can also
be performed with inputs that are not polynomials. Hence we can assume
that the Mora algorithm works for R, where R is either $k[[x_1, \dots, x_n]]$ or
$k\{x_1, \dots, x_n\}$.

a. Let G be a standard basis for an ideal $I \subset R$. Show that we obtain a
zero remainder on division of f by G if and only if $f \in I$.

b. Using part a, deduce that every ideal $I \subset R$ has a finite basis. (This is
the analog of the Hilbert Basis Theorem for $k[x_1, \dots, x_n]$.)

c. Deduce that the ascending chain condition holds for ideals in R. 

Exercise 5. Let > be any local monomial order on one of our local rings
R. Show that the set of monomials $x^\alpha$ greater than a fixed monomial $x^\beta$ has a
smallest element.

Exercise 6. Carry out the proof of the analog of Buchberger’s Criterion
for local orders, using Exercise 5 and the discussion before the
statement of Theorem (4.2).

Exercise 7. This exercise discusses an alternative method due to Lazard
for computing in local rings. Let >' be the order in $k[x_1, \dots, x_n, t]$ from
Definition (3.6). Given polynomials $f_1, \dots, f_s$, let $f_1^h, \dots, f_s^h$ be their
homogenizations in $k[x_1, \dots, x_n, t]$, and let G be a Grobner basis for
$\langle f_1^h, \dots, f_s^h\rangle$ with respect to the order >' consisting of homogeneous
polynomials (such Grobner bases always exist; see Theorem 2 in Chapter 8, §3 of
[CLO], for instance). Show that the dehomogenizations of the elements of
G (that is, the polynomials in $k[x_1, \dots, x_n]$ obtained from the elements of
G by setting $t = 1$) are a standard basis for the ideal generated by
$f_1, \dots, f_s$ in the local ring R with respect to the alex order.

Exercise 8. Let $R = k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$, and let $I \subset R$ be an ideal
such that $\dim R/\langle \mathrm{LT}(I)\rangle$ is finite for some local order on R. Describe an
algorithm which for the input $f \in R$ computes the remainder r from
Proposition (4.4).

Exercise 9. Complete the proof of Corollary (4.5) by showing that the
map $[f] \mapsto [r]$ (where $f = g + r$ is the decomposition of Proposition (4.4))
is well-defined, linear, and one-to-one.

Exercise 10.

a. Let $f_1, \dots, f_n \in k[x_1, \dots, x_n]$ be homogeneous polynomials of degrees
$d_1, \dots, d_n$ respectively. Assume that $I = \langle f_1, \dots, f_n\rangle$ is
zero-dimensional, and that the origin is the only point in $\mathbf{V}(I)$. Show
that the multiplicity is also the dimension of

$$k[x_1, \dots, x_n]/\langle f_1, \dots, f_n\rangle,$$

and then prove that the multiplicity of 0 as a solution of $f_1 = \cdots =
f_n = 0$ is $d_1 \cdots d_n$. Hint: Regard $f_1, \dots, f_n$ as homogeneous polynomials
in $x_0, x_1, \dots, x_n$, where $x_0$ is a new variable. Using $x_0, x_1, \dots, x_n$
as homogeneous coordinates for $\mathbb{P}^n$, show that $f_1 = \cdots = f_n = 0$ have
no nontrivial solutions when $x_0 = 0$, so that there are no solutions at
$\infty$ in the sense of Chapter 3. Then use Bezout’s Theorem as stated in
Chapter 3.

b. Let $f(x_1, \dots, x_n)$ be a homogeneous polynomial of degree d with an
isolated singularity at the origin. Show that the Milnor number of f at
the origin is $(d - 1)^n$.

Exercise 11. Determine the multiplicity of the solution at the origin for 
each of the following systems of polynomial equations. 

a. x^ + 2xy"^ — y‘^ = xy — = 0. 

b. + 2t/^ — y — 2z = x^ — Sy‘^ + IO2: = x^ — 7yz = 0. 

c. + 2/^ + x^ — yz — X = X — y + 2z = 0. 

Exercise 12. Compute the Milnor and Tjurina numbers at the origin of 
the following polynomials (all of which have an isolated singularity at the 
origin). 

a. $f(x, y) = (x^2 + y^2)^3 - 4x^2y^2$. The curve $\mathbf{V}(f) \subset \mathbb{R}^2$ is the four-leaved
rose; see Exercise 11 of [CLO], Chapter 3, §5.

b. /(x, 2/) = 2/^ — x^, n > 2. Express the Milnor number as a function of 

n. 

c. /(x, 2/, z) = xyz + x^ + 2/^ -f 

Exercise 13. (Tangent Cones) For each $f \in \langle x_1, \dots, x_n\rangle$, let $f_{min}$ be the
homogeneous component of lowest degree in f. Let $V = \mathbf{V}(f_1, \dots, f_s) \subset k^n$
be a variety containing the origin.

a. Let $G = \{g_1, \dots, g_t\}$ be a standard basis for

$$I = \langle f_1, \dots, f_s\rangle k[x_1, \dots, x_n]_{\langle x_1,\dots,x_n\rangle}$$

with respect to a local order >. Explain why $\mathrm{LT}_>(g_i)$ is one of the terms
in $g_{i,min}$ for each i.

b. Show that $\mathbf{V}(g_{1,min}, \dots, g_{t,min})$ is the tangent cone of V at the origin.

c. Consider the variety V = V(x^ — yz — x^y‘^ -\- 2z^) in k^. Using the 
$>_{alex}$ order on $k[x, y, z]_{\langle x,y,z\rangle}$, with $x > y > z$, show that the two given
polynomials in the definition of V are a standard basis for the ideal they
generate, and compute the tangent cone of V at the origin using part b.

Exercise 14. For an r-dimensional linear subspace $L \subset \mathbb{C}^n$, a polynomial
$f \in \mathbb{C}[x_1, \dots, x_n]$ restricts to a polynomial function $f_L$ on L.

a. Show that if f has an isolated singularity at the origin in $\mathbb{C}^n$, then for
almost all r-dimensional subspaces $L \subset \mathbb{C}^n$, $f_L$ has an isolated singularity
at the origin in L.

b. One can show, in fact, that there is an open dense set M of all
r-dimensional subspaces of $\mathbb{C}^n$ such that the Milnor number $\mu(f_L)$ of $f_L$
at the origin does not depend on the choice of L in M. This number
is denoted $\mu^r(f)$. Show that $\mu^1(f) = \mathrm{mult}(f) - 1$ where $\mathrm{mult}(f)$ (the
multiplicity of f) is the degree of the lowest degree term of f that occurs
with nonzero coefficient.

c. Compute $\mu^2(f)$ and $\mu^3(f)$ if

1. / = + i/4 ^7. 

2. / = x4 4- i/^ + + xyz] 

3. f = x^ + xy^ + y^z + 

4. f = y'^z + z^^. 

Note that if n is the number of variables, then $\mu^n(f) = \mu(f)$, so that
$\mu^3(f)$ is just the usual Milnor number for these examples. To compute
these numbers, use the milnor package in Singular and note that
planes of the form $z = ax + by$ are an open set in the set of all planes
in $\mathbb{C}^3$. One could also compute these Milnor numbers by hand. Note
that examples 1, 3, and 4 are weighted homogeneous polynomials. For
further background, the reader may wish to consult [Dim] or [AGV].

d. A family $\{f_t \in \mathbb{C}[x_1, \dots, x_n]\}$ of polynomials with an isolated singularity
at the origin for all t near 0 is said to be $\mu$-constant if $\mu(f_0) = \mu(f_t)$

for all t near 0. Show that the families ft = x^ y"^ z'^ -i- tx^y‘^ and / = 
x^ + txy^ + y'^z -h are //-constant, but that ft = x"^ y^ z^ txyz 
is not. 

e. If $f \in \mathbb{C}[x_1, \dots, x_n]$ has an isolated singularity at the origin, the n-tuple
of integers $(\mu^1(f), \dots, \mu^n(f))$ is called the Teissier $\mu^*$-invariant of f.
One says that a family $\{f_t\}$ is $\mu^*$-constant if $\mu^*(f_0) = \mu^*(f_t)$. Show
that /t = -b txy^ + y'^ z -f z^^ is //-constant, but not //* constant. 
This is a famous example due to Briançon and Speder; there are very
few known examples of $\mu$-constant families which are not $\mu^*$-constant.
At the time of writing, it is not known whether there exist $\mu$-constant
families in which $\mu^1$ is not constant. The attempt to find such examples
was one of the issues that motivated the development of early versions
of Singular.




Chapter 5 

Modules 



Modules are to rings what vector spaces are to fields: elements of a given 
module over a ring can be added to one another and multiplied by elements 
of the ring. Modules arise in algebraic geometry and its applications be- 
cause a geometric structure on a variety often corresponds algebraically to 
a module or an element of a module over the coordinate ring of the variety. 
Examples of geometric structures on a variety that correspond to modules 
in this way include subvarieties, various sets of functions, and vector fields 
and differential forms on a variety. In this chapter, we will introduce mod- 
ules over polynomial rings (and other related rings) and explore some of 
their algebra, including a generalization of the theory of Grobner bases for 
ideals. 



§1 Modules over Rings 

Formally, if R is a commutative ring with identity, an R-module is defined
as follows.

(1.1) Definition. A module over a ring R (or R-module) is a set M
together with a binary operation, usually written as addition, and an
operation of R on M, called (scalar) multiplication, satisfying the following
properties.

a. M is an abelian group under addition. That is, addition in M is associative
and commutative, there is an additive identity element $0 \in M$, and
each element $f \in M$ has an additive inverse $-f$ satisfying $f + (-f) = 0$.

b. For all $a \in R$ and all $f, g \in M$, $a(f + g) = af + ag$.

c. For all $a, b \in R$ and all $f \in M$, $(a + b)f = af + bf$.

d. For all $a, b \in R$ and all $f \in M$, $(ab)f = a(bf)$.

e. If 1 is the multiplicative identity in R, $1f = f$ for all $f \in M$.

The properties in the definition of a module may be summarized by
saying that the scalar multiplication by elements of R interacts as nicely
as possible with the addition operation in M. The simplest modules are
those consisting of all m × 1 columns of elements of R with componentwise
addition and scalar multiplication:

$$\begin{pmatrix} a_1\\ a_2\\ \vdots\\ a_m\end{pmatrix} + \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_m\end{pmatrix} = \begin{pmatrix} a_1 + b_1\\ a_2 + b_2\\ \vdots\\ a_m + b_m\end{pmatrix}, \qquad c\begin{pmatrix} a_1\\ a_2\\ \vdots\\ a_m\end{pmatrix} = \begin{pmatrix} ca_1\\ ca_2\\ \vdots\\ ca_m\end{pmatrix},$$

for any $a_1, \dots, a_m, b_1, \dots, b_m, c \in R$. We call any such column a vector
and the set of all such $R^m$.

One obtains other examples of R-modules by considering submodules of
$R^m$, that is, subsets of $R^m$ which are closed under addition and scalar
multiplication by elements of R and which are, therefore, modules in their
own right.

We might, for example, choose a finite set of vectors $\mathbf{f}_1, \dots, \mathbf{f}_s$ and consider
the set of all column vectors which can be written as an R-linear
combination of these vectors:

$$\{a_1\mathbf{f}_1 + \cdots + a_s\mathbf{f}_s \in R^m, \text{ where } a_1, \dots, a_s \in R\}.$$

We denote this set $\langle \mathbf{f}_1, \dots, \mathbf{f}_s\rangle$ and leave it to you to show that this is an
R-module.

Alternatively, consider an l × m matrix A with entries in the ring R. If
we define matrix multiplication in the usual way, then for any $\mathbf{f} \in R^m$, the
product $A\mathbf{f}$ is a vector in $R^l$. We claim (and leave it to you to show) that
the set

$$\ker A = \{\mathbf{f} \in R^m : A\mathbf{f} = \mathbf{0}\},$$

where $\mathbf{0}$ denotes the vector in $R^l$ all of whose entries are 0, is an R-module.

Exercise 1. Let R be any ring, and $R^m$ the m × 1 column vectors with
entries in R.

a. Show that the set $\langle \mathbf{f}_1, \dots, \mathbf{f}_s\rangle$ of R-linear combinations of any finite set
$\mathbf{f}_1, \dots, \mathbf{f}_s$ of elements of $R^m$ is a submodule of $R^m$.

b. If A is an l × m matrix with entries in R, show that ker A is a submodule
of $R^m$.

c. Let A be as above. Show that the set

$$\operatorname{im} A = \{\mathbf{g} \in R^l : \mathbf{g} = A\mathbf{f}, \text{ for some } \mathbf{f} \in R^m\}$$

is a submodule of $R^l$. In fact, show that it is the submodule consisting
of all R-linear combinations of the columns of A considered as elements
of $R^l$.

d. Compare parts a and c, and conclude that $\langle \mathbf{f}_1, \dots, \mathbf{f}_s\rangle = \operatorname{im} F$ where F
is the m × s matrix whose columns are the vectors $\mathbf{f}_1, \dots, \mathbf{f}_s$.

The modules $R^m$ are close analogues of vector spaces. In fact, if $R = k$ is
a field, then the properties in Definition (1.1) are exactly the same as those
defining a vector space over k, and it is a basic fact of linear algebra that
the vector spaces $k^m$ exhaust the collection of (finite-dimensional) k-vector
spaces. (More precisely, any finite dimensional k-vector space is isomorphic
to $k^m$ for some m.) However, submodules of $R^m$ when R is a polynomial
ring can exhibit behavior very different from vector spaces, as the following
exercise shows.

Exercise 2. Let $R = k[x, y, z]$.

a. Let $M \subset R^3$ be the module $\langle \mathbf{f}_1, \mathbf{f}_2, \mathbf{f}_3\rangle$ where




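Show that $M = \ker A$ where A is the 1 × 3 matrix $(x \;\; y \;\; z)$.

b. Show that the set $\{\mathbf{f}_1, \mathbf{f}_2, \mathbf{f}_3\}$ is minimal in the sense that $M \ne \langle \mathbf{f}_i, \mathbf{f}_j\rangle$,
$1 \le i < j \le 3$.

c. Show that the set $\{\mathbf{f}_1, \mathbf{f}_2, \mathbf{f}_3\}$ is R-linearly dependent. That is, show that
there exist nonzero $a_1, a_2, a_3 \in R = k[x, y, z]$ such that $a_1\mathbf{f}_1 + a_2\mathbf{f}_2 +
a_3\mathbf{f}_3 = \mathbf{0}$, where $\mathbf{0}$ is the zero vector in $R^3$.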

d. Note that the preceding two parts give an example of a submodule of 
k[x,y,z]^ in which there is a minimal generating set which is not linearly 
independent. This phenomenon cannot occur in any vector space. 

e. In fact more is true. Show that there is no linearly independent set 
of vectors which generate the module M. Hint: First show that any 
linearly independent set could have at most two elements. A fairly brutal 
computation will then give the result. 
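The display defining $\mathbf{f}_1, \mathbf{f}_2, \mathbf{f}_3$ did not survive in this copy of the text. Purely as an assumption, the sketch below takes them to be the three Koszul-type relations among x, y, z, namely $(y, -x, 0)^T$, $(z, 0, -x)^T$ and $(0, z, -y)^T$, and checks in plain Python that each lies in ker A for $A = (x \;\; y \;\; z)$ and that they satisfy a nontrivial R-linear relation. Polynomials are stored as dictionaries from exponent triples to integer coefficients; all helper names are hypothetical.

```python
from collections import defaultdict

def clean(terms):
    """Drop zero coefficients so that the zero polynomial is the empty dict."""
    return {m: c for m, c in terms.items() if c != 0}

def add(p, q):
    out = defaultdict(int)
    for m, c in p.items():
        out[m] += c
    for m, c in q.items():
        out[m] += c
    return clean(out)

def mul(p, q):
    out = defaultdict(int)
    for (a, b, c), u in p.items():
        for (d, e, f), v in q.items():
            out[(a + d, b + e, c + f)] += u * v
    return clean(out)

def neg(p):
    return {m: -c for m, c in p.items()}

X, Y, Z, ZERO = {(1, 0, 0): 1}, {(0, 1, 0): 1}, {(0, 0, 1): 1}, {}

# Assumed generators (Koszul-type relations), written as columns.
f1 = [Y, neg(X), ZERO]
f2 = [Z, ZERO, neg(X)]
f3 = [ZERO, Z, neg(Y)]

def row_times(row, vec):
    """Product of a 1 x 3 row with a column vector in R^3."""
    total = {}
    for r, v in zip(row, vec):
        total = add(total, mul(r, v))
    return total

def combo(coeffs, vecs):
    """R-linear combination sum(coeffs[i] * vecs[i]) in R^3."""
    result = [ZERO, ZERO, ZERO]
    for a, v in zip(coeffs, vecs):
        result = [add(r, mul(a, vi)) for r, vi in zip(result, v)]
    return result

A = [X, Y, Z]
in_kernel = [row_times(A, f) == {} for f in (f1, f2, f3)]
relation = combo([Z, neg(Y), X], [f1, f2, f3])   # z*f1 - y*f2 + x*f3
```

Under this assumption every generator is killed by A and $z\mathbf{f}_1 - y\mathbf{f}_2 + x\mathbf{f}_3 = \mathbf{0}$, which is the kind of dependence asked for in part c. The check says nothing about minimality or about part e.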

On the other hand, some of the familiar properties of vector spaces carry 
over to the module setting. 

Exercise 3. Let M be a module over a ring R. 

a. Show that the additive identity $0 \in M$ is unique.

b. Show that each $f \in M$ has a unique additive inverse.

c. Show that $0f = 0$ where $0 \in R$ on the left hand side is the zero element
of R and $0 \in M$ on the right hand side is the identity element in M.

Before moving on, we remark that up to this point in this book, we
have used the letters f, g, h most often for single polynomials (or elements
of the ring $R = k[x_1, \dots, x_n]$). In discussing modules, however, it will be
convenient to reserve the letters e, f, g, h to mean elements of modules over
some ring R, most often in fact over $R = k[x_1, \dots, x_n]$. In addition, we
will use boldface letters $\mathbf{e}, \mathbf{f}, \mathbf{g}, \mathbf{h}$ for column vectors (that is, elements of
the module $R^m$). This is not logically necessary, and may strike you as
slightly schizophrenic, but we feel that it makes the text easier to read.
For single ring elements, we will use letters such as a, b, c. Occasionally, for
typographical reasons, we will need to write vectors as rows. In these cases,
we use the notation $(a_1, \dots, a_m)^T$ to indicate the column vector which is the
transpose of the row vector $(a_1 \; \dots \; a_m)$.

Many of the algebraic structures studied in Chapters 1 through 4 of this
text may also be incorporated into the general context of modules, as the
exercise below shows. Part of what makes the concept of a module so useful
is that it simultaneously generalizes the notion of ideal and quotient ring.

Exercise 4.

a. Show that an ideal $I \subset R$ is an R-module, using the sum and product
operations from R.

b. Conversely, show that if a subset $M \subset R$ is a module over R, then M
is an ideal in R.

c. Let I be an ideal in R. Show that the quotient ring $M = R/I$ is an
R-module under the quotient ring sum operation, and the scalar
multiplication defined for cosets $[g] \in R/I$, and $f \in R$ by $f[g] = [fg] \in R/I$.

d. (For readers of Chapter 4) Show that the localization $M = R_P$ of R at
a prime ideal $P \subset R$ is a module over R, where the sum is the ring sum
operation from $R_P$, and the scalar product of $b/c \in R_P$ by $a \in R$ is
defined as $a \cdot b/c = ab/c \in R_P$.

e. Let M, N be two R-modules. The direct sum $M \oplus N$ is the set of all
ordered pairs $(f, g)$ with $f \in M$, and $g \in N$. Show that $M \oplus N$ is
an R-module under the component-wise sum and scalar multiplication
operations. Show that we can think of $R^m$ as the direct sum

$$R^m = R \oplus R \oplus \cdots \oplus R$$

of R with itself m times.

We have already encountered examples of submodules of $R^m$. More generally,
a subset of any R-module M which is itself an R-module (that is,
which is closed under addition and multiplication by elements of R) is called
a submodule of M. These are the analogues of vector subspaces of a vector
space.

Exercise 5. Let $F \subset M$ be a subset and let $N \subset M$ be the collection of
all $f \in M$ which can be written in the form

$$f = a_1 f_1 + \cdots + a_n f_n,$$

with $a_i \in R$ and $f_i \in F$ for all i. Show that N is a submodule of M.

The submodule N constructed in this exercise is called the submodule
of M generated by F. Since these submodules are natural generalizations
of the ideals generated by given subsets of the ring R, we will use the
same notation: the submodule generated by a set F is denoted by $\langle F\rangle$. If
$\langle F\rangle = M$, we say that F spans (or generates) M. If there is a finite set
that generates M, then we say that M is finitely generated.

Exercise 6.

a. Let R be a ring. Show that $M = R^m$ is finitely generated for all m. Hint:
Think of the standard basis for the vector space $k^m$ and generalize.

b. Show that $M = k[x, y]$ is a module over $R = k[x]$ using the ring sum
operation from $k[x, y]$ and the scalar multiplication given by polynomial
multiplication of general elements of M by elements in R. However, show
that M is not finitely generated as an R-module.

If N is a submodule of M, then the set of equivalence classes of elements
of M where $f \in M$ is deemed equivalent to $g \in M$ if and only if $f - g \in N$
forms an R-module with the operations induced from M (we ask you to
check this below). It is called the quotient of M by N and denoted by
M/N.

Exercise 7. As above, let M, N be R-modules with $N \subset M$, let $[f] =
\{g \in M : g - f \in N\}$ be the set of all elements of M equivalent to f,
and denote the set of all sets of equivalent elements by M/N. These sets
of equivalent elements are called equivalence classes or cosets. Note that
we can write $[f] = f + N$. Show that M/N is an R-module if we define
addition by $[f] + [g] = [f + g]$ and the scalar multiplication by $a[f] = [af]$
for $a \in R$. Hint: You need to show that these are well-defined. Also show that
the zero element is the set $[0] = N$.

The quotient module construction takes a little getting used to, but is 
extremely powerful. Several other constructions of modules and operations 
on modules are studied in the additional exercises. 

After defining any algebraic structure, one usually defines maps that preserve
that structure. Accordingly, we consider module homomorphisms, the
analogues of linear mappings between vector spaces.

(1.2) Definition. An R-module homomorphism between two R-modules
M and N is an R-linear map between M and N. That is, a map $\varphi : M \to N$
is an R-module homomorphism if for all $a \in R$ and all $f, g \in M$, we have

$$\varphi(af + g) = a\varphi(f) + \varphi(g).$$

This definition implies, of course, that $\varphi(f + g) = \varphi(f) + \varphi(g)$ and
$\varphi(af) = a\varphi(f)$ for all $a \in R$ and all $f, g \in M$.

When M and N are free modules, we can describe module homomorphisms
in the same way that we specify linear mappings between vector
spaces. For example, letting $M = N = R$, every R-module homomorphism
$\varphi : R \to R$ is given by multiplication by a fixed $f \in R$; that is, if $g \in R$ then
$\varphi(g) = fg$. To see this, given $\varphi$, let $f = \varphi(1)$. Then for any $a \in R$,

$$\varphi(a) = \varphi(a \cdot 1) = a \cdot \varphi(1) = af = fa.$$

Conversely, by the distributive law in R, multiplication by any f defines
an R-module homomorphism from R to itself.

More generally, $\varphi$ is a module homomorphism from $R^l$ to $R^m$ if and only
if there exist l elements $\mathbf{f}_1, \dots, \mathbf{f}_l \in R^m$ such that

$$\varphi((a_1, \dots, a_l)^T) = a_1\mathbf{f}_1 + \cdots + a_l\mathbf{f}_l$$

for all $(a_1, \dots, a_l)^T \in R^l$. Given $\varphi$, and letting $\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_l$ be the
standard basis vectors in $R^l$ (that is, $\mathbf{e}_i$ is the column vector with a 1
in the ith row and a 0 in all other rows), we can see this as follows.
For each i, let $\mathbf{f}_i = \varphi(\mathbf{e}_i)$. Each $(a_1, \dots, a_l)^T$ can be written uniquely
as $(a_1, \dots, a_l)^T = a_1\mathbf{e}_1 + \cdots + a_l\mathbf{e}_l$. But then, since $\varphi$ is a homomorphism,
knowing $\varphi(\mathbf{e}_j) = \mathbf{f}_j$ determines the value of $\varphi((a_1, \dots, a_l)^T)$ for all
$(a_1, \dots, a_l)^T \in R^l$. Then expanding each $\mathbf{f}_j$ in terms of the standard basis
in $R^m$, we see that $\varphi$ may be represented as multiplication by a fixed m × l
matrix $A = (a_{ij})$ with coefficients in R. The entries in the jth column give
the coefficients in the expansion of $\mathbf{f}_j = \varphi(\mathbf{e}_j)$ in terms of the standard
basis in $R^m$. We record the result of this discussion as follows (the second
part of the proposition is a result of Exercise 1).

(1.3) Proposition. Given any R-module homomorphism $\varphi : R^m \to R^l$,
there exists an l × m matrix A with coefficients in R such that $\varphi(\mathbf{f}) = A\mathbf{f}$
for all $\mathbf{f} \in R^m$. Conversely, multiplication by any l × m matrix defines an
R-module homomorphism from $R^m$ to $R^l$.
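To make Proposition (1.3) concrete, here is a minimal Python sketch over the ring $R = \mathbb{Z}$ (our choice, made only so that the arithmetic is built in). The matrix A and the helper name apply are illustrative assumptions; the sketch checks that the images of the standard basis vectors are the columns of A and that multiplication by A is R-linear on sample data.

```python
def apply(A, f):
    """Multiply the l x m matrix A (a list of rows) by the column vector f in R^m."""
    return [sum(a_ij * f_j for a_ij, f_j in zip(row, f)) for row in A]

A = [[1, 2, 0],
     [0, -1, 3]]          # a 2 x 3 matrix over R = Z, so phi maps Z^3 to Z^2

# The images of the standard basis vectors e_1, e_2, e_3 are the columns of A.
basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
images = [apply(A, e) for e in basis]

# Linearity check: phi(a*f + g) equals a*phi(f) + phi(g) for sample data.
a, f, g = 5, [1, -2, 4], [0, 7, 1]
lhs = apply(A, [a * fi + gi for fi, gi in zip(f, g)])
rhs = [a * x + y for x, y in zip(apply(A, f), apply(A, g))]
```

The same code works verbatim for any ring whose elements support + and *, for example polynomials represented by a suitable class.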

The discussion above actually shows that an R-module homomorphism
$\varphi : M \to N$ between two R-modules is always determined once one knows
the action of $\varphi$ on a set of generators of M. However, unlike the situation in
which $M = R^m$, one cannot define a homomorphism $\varphi$ on M by specifying
$\varphi$ arbitrarily on a set of generators of M. The problem is that there may be
relations among the generators, so that one must be careful to choose values
of $\varphi$ on the generators so that $\varphi$ is well-defined. The following exercise
should illustrate this point.

Exercise 8. Let $R = k[x, y]$.

a. Is there any R-module homomorphism $\varphi$ from $M = \langle x^2, y^2\rangle \subset R$ to R
satisfying $\varphi(x^2) = y$ and $\varphi(y^2) = x$? Why or why not?

b. Describe all $k[x, y]$-homomorphisms of $\langle x^2, y^2\rangle$ into $k[x, y]$.

As in the case of vector spaces, one can develop a theory of how the same
homomorphism can be represented by matrices with respect to different sets
of generators. We carry out some of this development in the exercises. We
have already defined the kernel and image of matrix multiplication. The
same definitions carry over to arbitrary homomorphisms.

(1.4) Definition. If $\varphi : M \to N$ is an R-module homomorphism between
two R-modules M and N, define the kernel of $\varphi$, denoted $\ker(\varphi)$, to be the
set

$$\ker(\varphi) = \{f \in M : \varphi(f) = 0\},$$

and the image of $\varphi$, denoted $\operatorname{im}(\varphi)$, to be the set

$$\operatorname{im}(\varphi) = \{g \in N : \text{there exists } f \in M \text{ with } \varphi(f) = g\}.$$

The homomorphism $\varphi$ is said to be an isomorphism if it is both one-to-one
and onto, and two R-modules M, N are called isomorphic, written $M \cong N$,
if there is some isomorphism $\varphi : M \to N$.

The proofs of the following statements are the same as those of the 
corresponding statements for linear mappings between vector spaces, and 
they are left as exercises for the reader. 

(1.5) Proposition. Let φ : M → N be an R-module homomorphism
between two R-modules M and N. Then

a. φ(0) = 0.

b. ker(φ) is a submodule of M.

c. im(φ) is a submodule of N.

d. φ is one-to-one (injective) if and only if ker(φ) = {0}.

Proof. See Exercise 16. □ 
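For a finite module the kernel and image can simply be enumerated. The sketch below assumes R = Z and M = N = Z/12 with φ given by multiplication by 4; these choices are illustrative and do not come from the text.

```python
n = 12
M = range(n)                      # the elements of Z/12, viewed as a Z-module

def phi(f):
    """Multiplication by 4 on Z/12; it is additive, hence a Z-module homomorphism."""
    return (4 * f) % n

kernel = sorted(f for f in M if phi(f) == 0)
image = sorted({phi(f) for f in M})

# On a finite set, phi is one-to-one exactly when it hits n distinct values.
injective = len(image) == n

print(kernel, image, injective)
```

Here the kernel is strictly larger than {0} and φ fails to be injective, in agreement with part d of the proposition.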

When we introduce the notions of linear combinations and linear
independence and R is not a field (for example when R = k[x_1, ..., x_n]), the
theory of modules begins to develop a significantly different flavor from that
of vector spaces. As in linear algebra, we say that a subset F = {f_1, ..., f_n}
of a module M is linearly independent over R (or R-linearly independent)
if the only linear combination a_1 f_1 + ··· + a_n f_n with a_i ∈ R and f_i ∈ F
which equals 0 ∈ M is the trivial one in which a_1 = ··· = a_n = 0. A set
F ⊂ M which is R-linearly independent and which spans M is said to be
a basis for M.

Recall from linear algebra that every vector space over a field has a basis.
In Exercise 2, we saw that not every module has a basis. An even simpler
example is supplied by the ideal M = ⟨x^2, y^2⟩ ⊂ R studied in Exercise 8
(which is the same as the R-module generated by x^2 and y^2 in R). The
set {x^2, y^2} is not a basis for M as a module because x^2 and y^2 are not
R-linearly independent. For example, there is a linear dependence relation
y^2 x^2 − x^2 y^2 = 0, but the coefficients y^2 and −x^2 are certainly not 0. On
the other hand, because {x^2, y^2} spans M, it is a basis for M as an ideal.
More generally, any ideal M in R = k[x_1, ..., x_n] which requires more
than a single polynomial to generate it cannot be generated by an R-linearly
independent set. This is true because any pair of polynomials f_1, f_2 ∈ R
that might appear in a generating set (an ideal basis) satisfies a non-trivial
linear dependence relation f_2 f_1 − f_1 f_2 = 0 with coefficients in R. Thus the
meaning of the word “basis” depends heavily on the context, and we will
strive to make it clear to the reader which meaning is intended by using the
phrases “ideal basis” or “module basis” to distinguish between the
alternatives when there is a possibility of confusion. The following proposition
gives a characterization of module bases.

(1.6) Proposition. Let M be a module over a ring R. A set F ⊂ M is a
module basis for M if and only if every f ∈ M can be written in one and
only one way as a linear combination

f = a_1 f_1 + ··· + a_n f_n,

where a_i ∈ R, and f_i ∈ F.

Proof. The proof is the same as the corresponding statement for bases 
of vector spaces. □ 



The examples above show that, unlike vector spaces, modules need not 
have any generating set which is linearly independent. Those that do are 
given a special name. 

(1.7) Definition. Let M be a module over a ring R. M is said to be a free
module if M has a module basis (that is, a generating set that is R-linearly
independent).



For instance, the R-module M = R^m is a free module. The standard
basis elements

e_1 = (1, 0, ..., 0)^T, e_2 = (0, 1, ..., 0)^T, ..., e_m = (0, 0, ..., 1)^T

form one basis for M as an R-module. There are many others as well. See
Exercise 19 below.

We remark that just because a module has a single generator, it need
not be free. As an example, let R be any polynomial ring and f ∈ R
a nonzero polynomial. Then M = R/⟨f⟩ is generated by the set [1] of
elements equivalent to 1. But [1] is not a basis because f · [1] = [f] = [0] =
0 ∈ M.

In the following exercise we will consider another very important class of
modules whose construction parallels one construction of vector subspaces
in k^n, and we will see a rather less trivial example of free modules.

Exercise 9. Let a_1, ..., a_m ∈ R, and consider the set M of all solutions
(X_1, ..., X_m)^T ∈ R^m of the linear equation

a_1 X_1 + ··· + a_m X_m = 0.

a. Show that M is a submodule of R^m. (In fact, this follows from Exercise
1 because M = ker A where A is the row matrix A = (a_1 ··· a_m).)

b. Take R = k[x, y], and consider the following special case of the linear
equation above:

X_1 + x^2 X_2 + (y − 2) X_3 = 0.

Show that f_1 = (−x^2, 1, 0)^T and f_2 = (−y + 2, 0, 1)^T form a basis for
M as an R-module in this case.

c. Generalizing the previous part, show that if R = k[x_1, ..., x_n], and one
of the coefficients in a_1 X_1 + ··· + a_m X_m = 0 is a non-zero constant,
then the module M of solutions is a free R-module.

It can be difficult to determine whether a submodule of R^m is free. For
example, the following, seemingly innocuous, generalization of Exercise 9
follows from the solution in 1976 by Quillen [Qui] and Suslin [Sus] of a
famous problem raised by Serre [Ser] in 1954. We will have more to say
about this problem in Exercises 25-27 and later in this chapter.

(1.8) Theorem (Quillen-Suslin). Let R = k[x_1, ..., x_n] and suppose
that a_1, ..., a_m ∈ R are polynomials that generate all of R (that
is, ⟨a_1, ..., a_m⟩ = ⟨1⟩ = R). Then the module M of all solutions
(X_1, ..., X_m)^T ∈ R^m of the linear equation

a_1 X_1 + ··· + a_m X_m = 0

is free.

In 1992, Logar and Sturmfels [LS] gave an algorithmic proof of the
Quillen-Suslin result, and in 1994 Park and Woodburn [PW] gave an algorithmic
procedure that allows one to compute a basis of ker A where A is an
explicitly given unimodular row. The procedure depends on some algorithms
that we will outline later in this chapter (and is quite complicated).

Exercise 10.

a. Let a_1, ..., a_m ∈ R. Show that the homomorphism R^m → R given by
matrix multiplication f ↦ Af by the row matrix A = (a_1 ··· a_m)
is onto if and only if ⟨a_1, ..., a_m⟩ = R. Hint: A is onto if and only if
1 ∈ im(A). Such a matrix is often called a unimodular row.

b. Show that Theorem (1.8) generalizes Exercise 9c.

c. Let R = k[x, y] and consider the equation

(1 + x)X_1 + (1 − y)X_2 + (x + xy)X_3 = 0.

That is, consider ker A in the special case A = (a_1  a_2  a_3) =
(1 + x  1 − y  x + xy). Show 1 ∈ ⟨a_1, a_2, a_3⟩.

d. Theorem (1.8) guarantees that one can find a basis for M = ker A in
the special case of part c. Try to find one. Hint: This is hard to do
directly; feel free to give up after trying and to look at Exercise 25.

e. In Exercise 25, we will show that the “trivial” relations,

h_1 = (a_2, −a_1, 0)^T, h_2 = (a_3, 0, −a_1)^T, h_3 = (0, a_3, −a_2)^T,

generate ker A. Assuming this, show that {h_1, h_2, h_3} is not linearly
independent and no proper subset generates ker A. This gives an example of
a minimal set of generators of a free module that does not contain a
basis.

The fact that some modules do not have bases (and the fact that even 
when they do, one may not be able to find them) raises the question of 
how one explicitly handles computations in modules. The first thing to 
note is that one not only needs a generating set, but also the set of all 
relations satisfied by the generators — otherwise, we have no way of knowing 
in general whether two elements expressed in terms of the generators are 
equal or not. 

For instance, suppose you know that M is a Q[x, y]-module and that
f_1, f_2, f_3 is a generating set. If someone asks you whether 4f_1 + 5f_2 + 6f_3
and f_1 + 3f_2 + 4f_3 represent the same element, then you cannot tell unless
you know whether the difference, 3f_1 + 2f_2 + 2f_3, equals zero in M. To
continue the example, if you knew that every relation on the f_1, f_2, f_3 was
a Q[x, y]-linear combination of the relations 3f_1 + (1 + x)f_2 = 0, f_1 +
(2x + 3)f_2 + 4y f_3 = 0, and (2 − 2x)f_2 + 4f_3 = 0, then you could settle
the problem provided that you could decide whether 3f_1 + 2f_2 + 2f_3 = 0
is a Q[x, y]-linear combination of the given relations (which it is).

Exercise 11. Verify that (no matter what the f_i are), if every linear relation
on the f_1, f_2, f_3 is a Q[x, y]-linear combination of the relations 3f_1 + (1 +
x)f_2 = 0, f_1 + (2x + 3)f_2 + 4y f_3 = 0 and (2 − 2x)f_2 + 4f_3 = 0, then
3f_1 + 2f_2 + 2f_3 = 0 is a Q[x, y]-linear combination of the given relations.

It is worthwhile to say a few more words about relations at this point.
Suppose that F = (f_1, ..., f_t) is an ordered t-tuple of elements of some
R-module M, so that f_1, ..., f_t ∈ M. Then a relation on F is an R-linear
combination of the f_i which is equal to 0:

a_1 f_1 + ··· + a_t f_t = 0 ∈ M.

We think of a relation on F as a t-tuple (a_1, ..., a_t) of elements of R.
Equivalently, we think of a relation as an element of R^t. Such relations are
also called syzygies from the Greek word συζυγία meaning “yoke” (and
“copulation”). In fact we have the following statement.

(1.9) Proposition. Let (f_1, ..., f_t) be an ordered t-tuple of elements f_i ∈
M. The set of all (a_1, ..., a_t)^T ∈ R^t such that a_1 f_1 + ··· + a_t f_t = 0 is
an R-submodule of R^t, called the (first) syzygy module of (f_1, ..., f_t), and
denoted Syz(f_1, ..., f_t).

Proof. Let (a_1, ..., a_t)^T, (b_1, ..., b_t)^T be elements of Syz(f_1, ..., f_t),
and let c ∈ R. Then

a_1 f_1 + ··· + a_t f_t = 0
b_1 f_1 + ··· + b_t f_t = 0

in M. Multiplying the first equation on both sides by c ∈ R, adding to the
second equation and using the distributivity properties from the definition
of a module, we obtain

(c a_1 + b_1) f_1 + ··· + (c a_t + b_t) f_t = 0.

This shows (c a_1 + b_1, ..., c a_t + b_t)^T is also an element of Syz(f_1, ..., f_t).
Hence Syz(f_1, ..., f_t) is a submodule of R^t. □
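The closure computation in the proof can be replayed on a toy case. This is a minimal sketch, assuming R = Z and the generators 4, 6, 10 of a Z-submodule of Z; the numbers and the helper name is_syzygy are illustrative only.

```python
F = (4, 6, 10)                     # f_1, f_2, f_3, taken in the Z-module Z

def is_syzygy(a, F=F):
    """True when a_1 f_1 + ... + a_t f_t = 0."""
    return sum(ai * fi for ai, fi in zip(a, F)) == 0

a = (3, -2, 0)                     # 3*4 - 2*6 = 0
b = (5, 0, -2)                     # 5*4 - 2*10 = 0
c = -7                             # an arbitrary ring element

# The combination c*a + b formed in the proof of the proposition.
combo = tuple(c * ai + bi for ai, bi in zip(a, b))

print(combo, is_syzygy(combo))
```

Checking c·a + b in one step covers closure under both addition (take c = 1) and scalar multiplication (take b = 0).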

This proposition allows us to be precise about what it means to “know”
all relations on a fixed set of generators of a module. If there are t
generators, then the set of relations is just a submodule of R^t. In Exercise 32 (and
in the next section), we will show that any submodule of R^t, and hence
any syzygy module, is finitely generated as a module, provided only that
every ideal of R is finitely generated (i.e., provided that R is Noetherian).
Hence, we “know” all relations on a set of generators of a module if we can
find a set of generators for the first syzygy module.

Since we think of elements of R^t as column vectors, we can think of
a finite collection of syzygies as columns of a matrix. If M is a module
spanned by the t generators f_1, ..., f_t, then a presentation matrix for M is
any matrix whose columns generate Syz(f_1, ..., f_t) ⊂ R^t. So, for example,
a presentation matrix A for the module of Exercise 11 would be

\[
A = \begin{pmatrix} 3 & 1 & 0 \\ 1 + x & 2x + 3 & 2 - 2x \\ 0 & 4y & 4 \end{pmatrix}.
\]
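The passage from a list of relations to a presentation matrix is just a transpose: each relation supplies one column. The sketch below uses the three relations quoted in the text; storing the entries as strings is an assumption made here to keep the example free of any polynomial library.

```python
# Each relation is the tuple of coefficients (a_1, a_2, a_3) of f_1, f_2, f_3.
# Entries are strings because we only arrange them and do no arithmetic.
relations = [
    ("3", "1+x", "0"),        # 3 f_1 + (1+x) f_2 = 0
    ("1", "2x+3", "4y"),      # f_1 + (2x+3) f_2 + 4y f_3 = 0
    ("0", "2-2x", "4"),       # (2-2x) f_2 + 4 f_3 = 0
]

# Relations become the COLUMNS of the presentation matrix, so transpose.
A = [list(row) for row in zip(*relations)]

for row in A:
    print("  ".join(entry.rjust(5) for entry in row))
```

The number of rows equals the number of generators (three here), while the number of columns equals the number of relations chosen to generate the syzygy module.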

If A is a presentation matrix for a module M with respect to some
generating set of M, then we shall say that A presents the module M. Note
that the number of rows of A is equal to the number of generators in
the generating set of M. The following proposition is easy to prove, but
exceedingly useful.



(1.10) Proposition. Suppose that A is an l × m matrix with entries in R,
and suppose that A is the presentation matrix for two different R-modules
M and N. Then

a. M and N are isomorphic as R-modules.

b. M (and, hence, N) is isomorphic to R^l/AR^m, where AR^m denotes the
image im A of R^m under multiplication by A.

Proof. Part b clearly implies part a, but it is more instructive to prove
part a directly. Since A is a presentation matrix for M, there is a set of
generators m_1, ..., m_l such that the columns of A generate the module of
syzygies on m_1, ..., m_l. Similarly, there is a set of generators n_1, ..., n_l
of N such that the columns of A generate Syz(n_1, ..., n_l). Define a
homomorphism φ : M → N by setting φ(m_i) = n_i and extending linearly.
That is, for any c_1, ..., c_l ∈ R, set φ(Σ c_i m_i) = Σ c_i n_i. We leave it to
the reader to show that φ is well-defined (that is, if Σ c_i m_i = Σ d_i m_i in
M for d_1, ..., d_l ∈ R, then φ(Σ c_i m_i) = φ(Σ d_i m_i)) and one-to-one. It is
clearly onto.

To show part b, note that if A is an l × m matrix, then AR^m is the
submodule of R^l generated by the columns of A. The quotient module R^l/AR^m
is generated by the cosets e_1 + AR^m, ..., e_l + AR^m (where e_1, ..., e_l
denotes the standard basis of unit vectors in R^l), and (c_1, ..., c_l)^T ∈
Syz(e_1 + AR^m, ..., e_l + AR^m) if and only if (c_1, ..., c_l)^T ∈ AR^m if
and only if (c_1, ..., c_l)^T is in the span of the columns of A. This says that
A is a presentation matrix for R^l/AR^m. Now apply part a. □

The presentation matrix of a module M is not unique. It depends on the 
set of generators that one chooses for M, and the set of elements that one 
chooses to span the module of syzygies on the chosen set of generators of M.
We could, for example, append the column (3, 2, 2)^T to the matrix A in
the example preceding Proposition (1.10) above to get a 3 × 4 presentation
matrix (see Exercise 11) of the same module. For a rather more dramatic
example, see Exercise 30 below. In the exercises, we shall give a charac- 
terization of the different matrices that can present the same module. The 
following exercise gives a few more examples of presentation matrices. 



Exercise 12. Let R = k[x, y].

a. Show that the 2 × 1 matrix (x, 0)^T presents the R-module k[y] ⊕ k[x, y], where
k[y] is viewed as an R-module by defining multiplication by x to be 0.

b. What module does the 1 × 2 matrix (x  0) present? Why does the
1 × 1 matrix (x) present the same module?
c. Find a matrix which presents the ideal M = ⟨x^2, y^3⟩ ⊂ R as an
R-module.

The importance of presentation matrices is that once we have a presen- 
tation matrix A for a module M, we have a concrete set of generators and 
relations for M (actually for an isomorphic copy of M), and so can work 
concretely with M. As an example, we characterize the homomorphisms of 
M into a free module. 

(1.11) Proposition. If A is an l × m presentation matrix for an R-module
M, then any R-module homomorphism φ : M → R^t can be represented by
a t × l matrix B such that BA = 0, where 0 denotes the t × m zero matrix.
Conversely, if B is any t × l matrix with entries in R such that BA = 0,
then B defines a homomorphism from M to R^t.

Proof. To see this, note that for M to have an l × m presentation matrix
means that M can be generated by l elements f_1, ..., f_l, say. Hence, φ is
determined by φ(f_1), ..., φ(f_l), which we think of as columns of the t × l
matrix B. We leave it as an exercise to show that φ is well-defined if and
only if BA = 0.

Conversely, if A is a presentation matrix of M with respect to a
generating set {f_1, ..., f_l}, and if B is any t × l matrix with entries in R such
that BA = 0, then B defines a homomorphism from M to R^t by mapping
Σ c_i f_i to Bc, where c = (c_1 ··· c_l)^T ∈ R^l. Again, we leave the proof
that the homomorphism is well-defined if BA = 0 as an exercise. □
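The criterion BA = 0 is easy to test mechanically. The sketch below assumes R = Z and the Z-module M = Z/2 ⊕ Z, presented by the 2 × 1 matrix A = (2, 0)^T (two generators, one relation 2·m_1 = 0); the module, the candidate matrices, and the helper names are illustrative assumptions.

```python
def mat_mul(B, A):
    """Product of a t x l matrix B and an l x m matrix A (lists of rows)."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]

def defines_hom(B, A):
    """B induces a homomorphism from the module presented by A iff B*A = 0."""
    return all(entry == 0 for row in mat_mul(B, A) for entry in row)

A = [[2],
     [0]]                 # presents Z/2 (+) Z over R = Z

good = [[0, 5]]           # m_1 -> 0, m_2 -> 5
bad = [[1, 5]]            # m_1 -> 1 would need 2*1 = 0 in Z

print(defines_hom(good, A), defines_hom(bad, A))
```

The rejected matrix reflects the familiar fact that a homomorphism into the free module Z must send the torsion generator to 0.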

Additional Exercises for §1 

Exercise 13. The ring k[x, y] can be viewed as a k-module, as a k[x]-module,
as a k[y]-module, or as a k[x, y]-module. Illustrate the differences
between these structures by providing a nontrivial example of a map from
k[x, y] to itself which is

a. a k-module homomorphism, but not a k[x]-module, k[y]-module, or
k[x, y]-module homomorphism,

b. a k[x]-module homomorphism, but not a k[y]-module, or k[x, y]-module
homomorphism,

c. a k[y]-module homomorphism, but not a k[x]-module, or k[x, y]-module
homomorphism,

d. a ring homomorphism, but not a k[x, y]-module homomorphism.

Exercise 14. Let N_1, N_2 be submodules of an R-module M.

a. Show that N_1 + N_2 = {f_1 + f_2 ∈ M : f_i ∈ N_i} is also a submodule of
M.

b. Show that N_1 ∩ N_2 is a submodule of M.

c. If N_1 and N_2 are finitely generated, show that N_1 + N_2 is also finitely
generated.

Exercise 15. Show that every free module with a finite basis is isomorphic
to R^m for some m. One can actually show more: namely, that any finitely
generated free module is isomorphic to R^m. See Exercise 19.

Exercise 16. Prove Proposition (1.5). 

Exercise 17. Let R = k[x, y, z] and let M ⊂ R^3 be the module described
in Exercise 2. Explicitly describe all homomorphisms M → R^l. Hint: The
set of relations on f_1, f_2, f_3 is generated by a single element which you can
easily find.

Exercise 18. Complete the proof of Proposition (1.11). 

Exercise 19. Let R = k[x_1, ..., x_n].

a. Show that if A = (a_ij) is any invertible s × s matrix with coefficients
in k, then the vectors

f_i = a_i1 e_1 + ··· + a_is e_s,

i = 1, ..., s, also form a basis of the free module R^s.

b. Show that a finitely generated module N over a ring R is free if and
only if N is isomorphic to M = R^s as a module, for some s. (In view
of Exercise 15, the point is to show that if a module is free and has a
finite set of generators, then it has a finite basis.)

c. Show that A = (a_ij) is an invertible s × s matrix with coefficients in
R if and only if det A is a non-zero element of k. Repeat part a with A
invertible with coefficients in R. Hint: Consider the adjoint matrix of A
as defined in linear algebra.

Exercise 20. Let M and N be R-modules.

a. Show that the set hom(M, N) of all R-module homomorphisms from M
to N is an R-module with a suitable definition of addition and scalar
multiplication.

b. If M is presented by a matrix A, and N is presented by a matrix B, what
conditions must a matrix C representing a homomorphism φ : M → N
satisfy? Hint: Compare with Proposition (1.11).

c. Find a matrix D presenting hom(M, N).

Exercise 21. Suppose that M, N are R-modules and N ⊂ M.

a. Show that the mapping ν : M → M/N defined by ν(f) = [f] = f + N
is an R-module homomorphism.

b. Let φ : M → N. Show that there is an R-module isomorphism between
M/ker(φ) and im(φ).
Exercise 22. Let N_1 and N_2 be submodules of an R-module M, and
define

(N_1 : N_2) = {a ∈ R : af ∈ N_1 for all f ∈ N_2}.

Show that (N_1 : N_2) is an ideal in R. The ideal (0 : N) is also called the
annihilator of N, denoted ann(N).

Exercise 23.

a. Let M be an R-module, and let I ⊂ R be an ideal. Show that IM =
{af : a ∈ I, f ∈ M} is a submodule of M.

b. We know that M/IM is an R-module. Show that M/IM is also an
R/I-module.

Exercise 24.

a. Let L, M, N be R-modules with L ⊂ M ⊂ N. Describe the homomorphisms
which relate the three quotient modules and show that N/M is
isomorphic to (N/L)/(M/L).

b. Let M, N be submodules of R. Show that (M + N)/N is isomorphic to
M/(M ∩ N).

(Note: The result in part a is often called the Third Isomorphism Theorem
and that in part b the Second Isomorphism Theorem. The First
Isomorphism Theorem is the result established in Exercise 21b.)

Exercise 25. This is a continuation of Exercise 10. We let R = k[x, y]
and consider the equation

(1 + x)X_1 + (1 − y)X_2 + (x + xy)X_3 = 0.

That is, we consider ker A in the special case A = (a_1  a_2  a_3) =
(1 + x  1 − y  x + xy). Since 1 ∈ ⟨a_1, a_2, a_3⟩ (part c of Exercise 10),
Theorem (1.8) guarantees that one can find a basis for M = ker A in the
special case of Exercise 10c. We find a basis for M as follows.

a. Find a triple of polynomials f = (f_1, f_2, f_3)^T ∈ R^3 such that (1 + x)f_1 +
(1 − y)f_2 + (x + xy)f_3 = 1.

b. By multiplying the relation Af = 1 in part a above by 1 + x and transposing,
then by 1 − y and transposing, and finally by x + xy and transposing,
find three vectors g_1, g_2, g_3 ∈ ker A (these vectors are the columns of
the 3 × 3 matrix I − f · A, where I is the 3 × 3 identity matrix). Show
these vectors span ker A. Hint: If Ah = 0, then h = (I − f · A)h is a
linear combination of the columns of I − f · A.

c. Show that {g_1, g_2} is a basis.

d. Use part b to show that the “trivial” relations

h_1 = (a_2, −a_1, 0)^T, h_2 = (a_3, 0, −a_1)^T, h_3 = (0, a_3, −a_2)^T

generate ker A. As pointed out in Exercise 10, they supply an example
of a minimal set of generators of a free module that does not
contain a basis.

Exercise 26. The goal of this exercise is to show how Theorem (1.8)
follows from the solution of the Serre problem. An R-module M is called
projective if it is a direct summand of a free module: that is, if there is an
R-module N such that M ⊕ N is a finitely generated free module. In 1954,
Serre asked whether every projective R-module, when R is a polynomial
ring, is free, and Quillen and Suslin independently proved that this was the
case in 1976.

a. Show that Z/6 ≅ Z/3 ⊕ Z/2, so that Z/3 is a projective Z/6-module
which is clearly not a free Z/6-module. (So, the answer to
Serre’s question is definitely negative if R is not a polynomial ring
k[x_1, ..., x_n].)

b. Let R = k[x_1, ..., x_n] and let A = (a_1 ··· a_l) be a 1 × l matrix
such that 1 ∈ ⟨a_1, ..., a_l⟩. Then multiplication by A defines an onto
map R^l → R. Show that (ker A) ⊕ R ≅ R^l, so that ker A is projective.
Hint: Fix f ∈ R^l such that Af = 1. Given any h ∈ R^l, write h =
h_1 + h_2 (uniquely) with h_2 = (Ah)f and h_1 = h − (Ah)f ∈ ker A. The
Quillen-Suslin result now implies Theorem (1.8).

Exercise 27. The purpose of this exercise is to generalize the methods
of Exercise 25 to further investigate the result of Theorem (1.8). Let R =
k[x_1, ..., x_n] and let A = (a_1 ··· a_l) be a 1 × l matrix such that
1 ∈ ⟨a_1, ..., a_l⟩.

a. Choose f ∈ R^l such that Af = 1. Generalize the result of Exercise 25b
to show that the columns of I − f · A are elements of Syz(a_1, ..., a_l)
that generate Syz(a_1, ..., a_l).

b. Show that one can extract a basis from the columns of I − f · A in the
case that one of the entries of f is a nonzero element of k.

c. The preceding part shows Theorem (1.8) in the special case that there
exists f ∈ R^l such that Af = 1 and some entry of f is a non-zero
element of k. Show that this includes the case examined in Exercise 9c.
Also show that if f is as above, then the set {h ∈ R^l : Ah = 1} =
f + Syz(a_1, ..., a_l).

d. There exist unimodular rows A with the property that no f ∈ R^l such
that Af = 1 has an entry which is a nonzero element of k. (In the
case R = k[x, y], the matrix A = (1 + xy + x^4  y^2 + x − 1  xy − 1)
provides such an example.)
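The construction used in Exercises 25 to 27, taking the columns of I − f · A as elements of ker A, can be tried out on numbers before attempting it with polynomials. The sketch below assumes R = Z, which is not a polynomial ring over a field, but the identities A(I − f · A) = 0 and (I − f · A)h = h for h ∈ ker A use only Af = 1. The row A, the vector f, and the test vector h are illustrative choices.

```python
def mat_vec(G, h):
    """Multiply a square matrix G (a list of rows) by a column vector h."""
    return [sum(g * x for g, x in zip(row, h)) for row in G]

A = [2, 3, 5]                 # a unimodular row over Z
f = [-1, 1, 0]                # A f = -2 + 3 + 0 = 1
Af = sum(a * x for a, x in zip(A, f))

n = len(A)
# G = I - f*A, where f*A is the n x n matrix with (i, j) entry f_i * a_j.
G = [[(1 if i == j else 0) - f[i] * A[j] for j in range(n)] for i in range(n)]

columns = [list(col) for col in zip(*G)]
in_kernel = [sum(a * c for a, c in zip(A, col)) == 0 for col in columns]

h = [1, 1, -1]                # A h = 2 + 3 - 5 = 0, so h lies in ker A
reproduced = mat_vec(G, h) == h

print(Af, columns, in_kernel, reproduced)
```

Since G·h = h for every h in ker A, each such h is the combination of the columns of G with coefficients h_1, ..., h_n, which is exactly the spanning argument suggested in the hint to Exercise 25b.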

Exercise 28. Let φ : M → N be an R-module homomorphism. The
cokernel of φ is by definition the module coker(φ) = N/im(φ). Show that
φ is onto if and only if coker(φ) = {0}. (Note that in terms of this definition,
Proposition (1.10) says that if M is an R-module with an l × m presentation
matrix A, then M is isomorphic to the cokernel of the homomorphism
from R^m to R^l given by multiplication by A.)

Exercise 29. We have just seen that a presentation matrix determines a
module up to isomorphism. The purpose of this exercise is to characterize
the operations on a matrix which do not change the module it presents.

a. Let A be the m × n matrix representing a homomorphism φ : R^n →
R^m with respect to bases F = (f_1, ..., f_n) of R^n and G =
(g_1, ..., g_m) of R^m. Let F' = (f'_1, ..., f'_n) be another basis of R^n and
P = (p_ij) the n × n invertible matrix with p_ij ∈ R such that F = F'P.
Similarly, let G' = (g'_1, ..., g'_m) be another basis of R^m and Q = (q_ij)
the m × m invertible matrix with q_ij ∈ R such that G = G'Q. Show
that A' = QAP^{-1} represents φ with respect to the bases F' of R^n
and G' of R^m. Hint: Adapt the proof of the analogous result for vector
spaces.

b. If A is an m × n presentation matrix for an R-module M, and if A' =
QAP^{-1} with P any n × n and Q any m × m invertible matrices with
coefficients in R, show that A' also presents M.

c. In particular if A' is an m × n matrix obtained from A by adding c,
c ∈ R, times the ith column of A to the jth column of A, or c times
the ith row of A to the jth row of A, show that A' and A present the
same module. Hint: If A' is obtained from A by adding c times the ith
column of A to the jth column of A then A' = AP where P is the
n × n matrix with ones along the diagonal and all other entries zero
except the ijth, which equals c.

d. If A' is obtained from A by deleting a column of zeroes (assume that A
is not a single column of zeroes), show that A and A' present the same
module. Hint: A column of zeroes represents the trivial relation.

e. Suppose that A has at least two columns and that its jth column is
e_i (the standard basis vector of R^m with 1 in the ith row and all other
entries zero). Let A' be obtained from A by deleting the ith row and
jth column of A. Show that A and A' present the same module. Hint:
To say that a column of A is e_i is to say that the ith generator of the
module being presented is zero.

Exercise 30. Let R = k[x, y] and consider the R-module M presented by
the matrix

\[
A = \begin{pmatrix} 3 & 1 & 0 \\ 1 + x & 2x + 3 & 2 - 2x \\ 0 & 4y & 4 \end{pmatrix}
\]

(compare Exercise 6 and the discussion preceding Exercise 7). Use the 1
in row 1 and column 2 and elementary row operations to make the second
column e_1. Use the operation in part e of the preceding exercise to reduce
to a 2 × 2 matrix. Make the entry in row 2 and column 2 a 1 and use row
operations to clear the entry in row 1 column 2, and repeat the operation in
part e. Conclude that the 1 × 1 matrix (−8 − 5x − 6y(x − 1)) also presents
M, whence M ≅ k[x, y]/⟨−8 − 5x − 6y(x − 1)⟩.

Exercise 31. The purpose of this exercise is to show that two matrices
present the same module if and only if they can be transformed into one
another by the operations of Exercise 29.

a. Let A be a presentation matrix of the R-module M with respect to
a generating set f_1, ..., f_m. Suppose that g_1, ..., g_s ∈ M and write
g_i = Σ b_ji f_j with b_ji ∈ R. Let B = (b_ij). Show that the block matrix

\[
\begin{pmatrix} A & -B \\ 0 & I \end{pmatrix}
\]

presents M with respect to the generators (f_1, ..., f_m; g_1, ..., g_s).

b. Suppose that g_1, ..., g_s also generate M and that A' presents M with
respect to this set of generators. Write f_i = Σ c_ji g_j and let C = (c_ij).
Show that the block matrix

\[
D = \begin{pmatrix} A & -B & I & 0 \\ 0 & I & -C & A' \end{pmatrix}
\]

presents M with respect to the generators (f_1, ..., f_m; g_1, ..., g_s).

c. Show that D can be reduced to both A and to A' by repeatedly applying
the operations of Exercise 29. Hint: Show that row operations give the
block matrix

\[
\begin{pmatrix} A & 0 & I - BC & BA' \\ 0 & I & -C & A' \end{pmatrix}
\]

which reduces by part e of Exercise 29 to the matrix

\[
\begin{pmatrix} A & I - BC & BA' \end{pmatrix}.
\]

Show that the columns of I − BC and of BA' are syzygies, hence spanned
by the columns of A.

d. Show that any presentation matrix of a module can be transformed
into any other by a sequence of operations from Exercise 29 and their
inverses.

Exercise 32.

a. Show that if every ideal I of R is finitely generated (that is, if R is
Noetherian), then any submodule M of R^t is finitely generated. Hint:
Proceed by induction. If t = 1, M is an ideal, hence finitely generated.
If t > 1, show that the set of first components of vectors in M is an
ideal in R, hence finitely generated. Suppose that r_i ∈ R, 1 ≤ i ≤ s,
generate this ideal, and choose column vectors f_1, ..., f_s ∈ M with
first components r_1, ..., r_s respectively. Show that the submodule
M' of M consisting of vectors in M with first component 0 is finitely
generated. Show that f_1, ..., f_s together with any generating set of M'
is a generating set of M.

b. Show that if R is Noetherian, any submodule of a finitely generated
R-module M is finitely generated. Hint: If M is generated by f_1, ..., f_s
it is an image of R^s under a surjective homomorphism.

Exercise 33. There is another way to view Exercise 31 which is frequently
useful, and which we outline here. If A and A' are m × t matrices such that
A' = QAP^{-1} for an invertible m × m matrix Q and an invertible t × t matrix
P, then we say that A' and A are equivalent. Equivalent matrices present
the same modules (because we can view P ∈ GL(t, R) and Q ∈ GL(m, R)
as a change of basis in R^t and R^m respectively).

a. Let A be an m × t matrix and A' an r × s matrix with coefficients in R.
Show that A and A' present identical modules if and only if the matrices

\[
\begin{pmatrix} A & 0 & 0 & 0 \\ 0 & I_r & 0 & 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 0 & 0 & I_m & 0 \\ 0 & 0 & 0 & A' \end{pmatrix}
\]

are equivalent. Hint: This is equivalent to Exercise 31.

b. In part a above, show that we can take P = I.

c. Two matrices A and A' are called Fitting equivalent if there exist identity
and zero matrices such that

\[
\begin{pmatrix} A & 0 & 0 \\ 0 & I & 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} I & 0 & 0 \\ 0 & A' & 0 \end{pmatrix}
\]

are equivalent. Show that A and A' present the same module if and only
if A and A' are Fitting equivalent.



§2 Monomial Orders and Gröbner Bases for Modules

Throughout this section R will stand for a polynomial ring k[x_1, ..., x_n].
The goals of this section are to develop a theory of monomial orders in the
free modules R^m and to introduce Gröbner bases for submodules M ⊂ R^m,
in order to be able to solve the following problems by methods generalizing
the ones introduced in Chapter 1 for ideals in R.

(2.1) Problems.

a. (Submodule Membership) Given a submodule M ⊂ R^m and f ∈ R^m,
determine if f ∈ M.

b. (Syzygies) Given an ordered s-tuple of elements (f_1, ..., f_s) of R^m (for
example, an ordered set of generators), find a set of generators for the
module Syz(f_1, ..., f_s) ⊂ R^s of syzygies.

One can restate problem 2.1b as that of finding a presentation matrix
for a submodule of R^m. It is easy to see why Gröbner bases might be
involved in solving the submodule membership problem. When m = 1,
a submodule of R^m is the same as an ideal in R (see Exercise 4b of §1).
Division with respect to a Gröbner basis gives an algorithmic solution of the
ideal membership problem, so it is natural to hope that a parallel theory
for submodules of R^m might be available for general m. In the next section,
we shall see that Gröbner bases are also intimately related to the problem
of computing syzygies.

As we will see, one rather pleasant surprise is the way that, once we
introduce the terminology needed to extend the notion of monomial orders
to the free modules R^m, the module case follows the ideal case almost
exactly. (Also see Exercise 6 below for a way to encode a module as a
portion of an ideal and apply Gröbner bases for ideals.)

Let us first agree that a monomial m in R^m is an element of the form
x^α e_i for some i. We say m contains the standard basis vector e_i. Every
element f ∈ R^m can be written in a unique way as a k-linear combination
of monomials m_i

f = Σ_{i=1}^{n} c_i m_i,

where c_i ∈ k, c_i ≠ 0. Thus for example, in k[x, y]^3



f = (5xy^2 − y^10 + 3, 4x^3 + 2y, 16x)^T
  = 5xy^2 e_1 − y^10 e_1 + 3e_1 + 4x^3 e_2 + 2y e_2 + 16x e_3,



which is a k-linear combination of monomials. The product c · m of a
monomial m with an element c ∈ k is called a term and c is called its
coefficient. We say that the terms c_i m_i, c_i ≠ 0, in the expansion of f ∈ R^m
and the corresponding monomials m_i belong to f.

If m, n are monomials in R^m, m = x^α e_i, n = x^β e_j, then we say that
n divides m (or m is divisible by n) if and only if i = j and x^β divides
x^α. If n divides m we define the quotient m/n to be x^α/x^β ∈ R (that is,
m/n = x^{α−β}). Note that the quotient is an element of the ring R, and if n
divides m, we have (m/n) · n = m, which we certainly want. If m and n
are monomials containing the same basis element e_i, we define the greatest
common divisor, GCD(m, n), and least common multiple, LCM(m, n) to
be the greatest common divisor and least common multiple, respectively, of
x^α and x^β, times e_i. On the other hand, if m, n contain different standard
basis vectors, we define LCM(m, n) = 0.
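These definitions translate directly into operations on exponent vectors. In the sketch below a monomial x^α e_i is encoded as the pair (α, i), and the value None stands for the zero element in the case LCM = 0; both conventions, and the function names, are assumptions made for illustration.

```python
def divides(n, m):
    """n divides m iff both contain the same e_i and x^beta divides x^alpha."""
    (beta, j), (alpha, i) = n, m
    return i == j and all(b <= a for a, b in zip(alpha, beta))

def quotient(m, n):
    """Exponent vector of m/n, a monomial of the ring R (not of R^m)."""
    if not divides(n, m):
        raise ValueError("n does not divide m")
    (alpha, _), (beta, _) = m, n
    return tuple(a - b for a, b in zip(alpha, beta))

def gcd(m, n):
    """GCD of two monomials containing the same basis vector."""
    (alpha, i), (beta, j) = m, n
    if i != j:
        raise ValueError("GCD is only defined for a common basis vector")
    return (tuple(min(a, b) for a, b in zip(alpha, beta)), i)

def lcm(m, n):
    """LCM of two monomials; None encodes 0 when the basis vectors differ."""
    (alpha, i), (beta, j) = m, n
    if i != j:
        return None
    return (tuple(max(a, b) for a, b in zip(alpha, beta)), i)

m = ((2, 3), 1)     # x^2 y^3 e_1
n = ((1, 3), 1)     # x y^3 e_1
print(divides(n, m), quotient(m, n), lcm(((2, 0), 1), ((1, 3), 1)))
```

Returning a bare exponent vector from quotient mirrors the remark that m/n lives in the ring R rather than in the module.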

We say that a submodule M ⊂ R^m is a monomial submodule if M can
be generated by a collection of monomials. As for monomial ideals, it is
easy to see that f is in a monomial submodule M if and only if every
term belonging to f is in M. Monomial submodules have properties closely
paralleling those of monomial ideals.

(2.3) Proposition.

a. Every monomial submodule of R^m is generated by a finite collection of
monomials.

b. Every infinite ascending chain M_1 ⊂ M_2 ⊂ ··· of monomial submodules
of R^m stabilizes. That is, there exists N such that M_N = M_{N+1} = ··· =
M_{N+l} = ··· for all l ≥ 0.

c. Let {m_1, ..., m_t} be a set of monomial generators for a monomial
submodule of R^m, and let ε_1, ..., ε_t denote the standard basis vectors in
R^t. Let m_ij = LCM(m_i, m_j). The syzygy module Syz(m_1, ..., m_t)
is generated by the syzygies σ_ij = (m_ij/m_i)ε_i − (m_ij/m_j)ε_j, for all
1 ≤ i < j ≤ t (σ_ij = 0 unless m_i and m_j contain the same standard
basis vector in R^m).

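Proof. For part a, let $M$ be a monomial submodule of $R^m$. For each $i$,
let $M_i = M \cap R\mathbf{e}_i$ be the subset of $M$ consisting of elements
whose $j$th components are zero for all $j \neq i$. In Exercise 5 below, you
will show that $M_i$ is an $R$-submodule of $M$. Each element of $M_i$ has the
form $f\mathbf{e}_i$ for some $f \in R$. By Exercise 2 of §1 of this chapter,
$M_i = I_i\mathbf{e}_i$ for some ideal $I_i \subset R$, and it follows that
$I_i$ must be a monomial ideal. By Dickson’s Lemma for monomial ideals (see,
for instance, [CLO], Theorem 5 of Chapter 2, §4), it follows that $I_i$ has a
finite set of generators $x^{\alpha(i1)}, \ldots, x^{\alpha(id_i)}$. But then
the

$$\begin{array}{c}
x^{\alpha(11)}\mathbf{e}_1, \ldots, x^{\alpha(1d_1)}\mathbf{e}_1 \\
x^{\alpha(21)}\mathbf{e}_2, \ldots, x^{\alpha(2d_2)}\mathbf{e}_2 \\
\vdots \\
x^{\alpha(m1)}\mathbf{e}_m, \ldots, x^{\alpha(md_m)}\mathbf{e}_m
\end{array}$$

generate $M$.

Part b follows from part a. See Exercise 5 below.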

For part c, first observe that if $(a_1, \ldots, a_t)^T$ is a syzygy on a
collection of monomials and we expand in terms of the standard basis in $R^m$:

$$0 = a_1\mathbf{m}_1 + \cdots + a_t\mathbf{m}_t = f_1\mathbf{e}_1 + \cdots + f_m\mathbf{e}_m,$$

then $f_1 = \cdots = f_m = 0$, and the syzygy is a sum of syzygies on subsets
of the $\mathbf{m}_j$ containing the same $\mathbf{e}_i$. Hence we can restrict
to considering collections of monomials containing the same $\mathbf{e}_i$:

$$\mathbf{m}_1 = x^{\alpha_1}\mathbf{e}_i, \; \ldots, \; \mathbf{m}_s = x^{\alpha_s}\mathbf{e}_i.$$

Now, if $(a_1, \ldots, a_s)^T$ is a syzygy in
$\mathrm{Syz}(\mathbf{m}_1, \ldots, \mathbf{m}_s)$, we can collect terms of the
same multidegree in the expansion $a_1 x^{\alpha_1} + \cdots + a_s x^{\alpha_s} = 0$.
Each sum of terms of the same multidegree in this expansion must also be zero,
and the only way this can happen is if the coefficients (in $k$) of those terms
sum to zero. Hence $(a_1, \ldots, a_s)^T$ can be written as a sum of syzygies of
the form

$$(c_1 x^{\alpha - \alpha_1}, \ldots, c_s x^{\alpha - \alpha_s})^T,$$

with $c_1, \ldots, c_s \in k$ satisfying $c_1 + \cdots + c_s = 0$. Such a syzygy
is called a homogeneous syzygy, and we have just shown that all syzygies are
sums of homogeneous syzygies. (Compare with Lemma 7 of Chapter 2, §9 of
[CLO].)

When $s = 3$, for instance, we can write a syzygy

$$(c_1 x^{\alpha - \alpha_1}, \; c_2 x^{\alpha - \alpha_2}, \; c_3 x^{\alpha - \alpha_3})^T$$

with $c_1 + c_2 + c_3 = 0$ as a sum:

$$(c_1 x^{\alpha - \alpha_1}, \; -c_1 x^{\alpha - \alpha_2}, \; 0)^T + (0, \; (c_1 + c_2) x^{\alpha - \alpha_2}, \; c_3 x^{\alpha - \alpha_3})^T,$$

where $(c_1 x^{\alpha - \alpha_1}, -c_1 x^{\alpha - \alpha_2})^T = c_1 (x^{\alpha - \alpha_1}, -x^{\alpha - \alpha_2})^T$
is a syzygy on the pair of monomials $x^{\alpha_1}, x^{\alpha_2}$ and
$((c_1 + c_2) x^{\alpha - \alpha_2}, c_3 x^{\alpha - \alpha_3})^T = -c_3 (x^{\alpha - \alpha_2}, -x^{\alpha - \alpha_3})^T$
is a syzygy on the pair $x^{\alpha_2}, x^{\alpha_3}$.

In fact, for any $s$, every homogeneous syzygy can be written as a sum
of syzygies between pairs of monomials in a similar way (see Exercise 5
below). Also observe that given two monomials $x^\alpha$ and $x^\beta$ and some
$x^\gamma$ that is a multiple of each, then the syzygy
$(x^{\gamma - \alpha}, -x^{\gamma - \beta})^T$ is a monomial times

$$\sigma = \bigl(\mathrm{LCM}(x^\alpha, x^\beta)/x^\alpha, \; -\mathrm{LCM}(x^\alpha, x^\beta)/x^\beta\bigr)^T.$$

From here, part c of the proposition follows easily. □

If $M = \langle \mathbf{m}_1, \ldots, \mathbf{m}_t \rangle$ and $\mathbf{f}$ is
an arbitrary element of $R^m$, then $\mathbf{f} \in M$ if and only if every term
of $\mathbf{f}$ is divisible by some $\mathbf{m}_i$. Thus, it is easy to
solve the submodule membership problem for monomial submodules.
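
The membership test just described is mechanical enough to sketch in code. The following Python fragment is only an illustration and not part of the original text. It assumes that a monomial $x^\alpha \mathbf{e}_i$ is stored as a pair `(alpha, i)` with `alpha` a tuple of exponents, and that an element of $R^m$ is a dictionary from monomials to nonzero coefficients; all function names are hypothetical.

```python
def divides(n, m):
    """True if the monomial n = x^beta e_j divides m = x^alpha e_i."""
    (beta, j), (alpha, i) = n, m
    return i == j and all(b <= a for a, b in zip(alpha, beta))


def quotient(m, n):
    """The quotient m/n = x^(alpha - beta), an element of the ring R."""
    if not divides(n, m):
        raise ValueError("n does not divide m")
    return tuple(a - b for a, b in zip(m[0], n[0]))


def lcm(m, n):
    """LCM(m, n); None stands for 0 when the basis vectors differ."""
    (alpha, i), (beta, j) = m, n
    if i != j:
        return None
    return (tuple(max(a, b) for a, b in zip(alpha, beta)), i)


def in_monomial_submodule(f, gens):
    """f lies in <gens> iff every term of f is divisible by some generator."""
    return all(any(divides(g, m) for g in gens) for m in f)


# M = <xy e_1, y^2 e_2> in k[x, y]^2, exponents listed as (deg x, deg y)
gens = [((1, 1), 1), ((0, 2), 2)]
f = {((2, 1), 1): 3, ((0, 3), 2): -1}   # 3x^2y e_1 - y^3 e_2
g = {((2, 1), 1): 3, ((1, 0), 2): 1}    # 3x^2y e_1 + x e_2
print(in_monomial_submodule(f, gens), in_monomial_submodule(g, gens))
```

Here the second element fails the test because its term $x\mathbf{e}_2$ is not divisible by either generator.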

Extending the theory of Gröbner bases to modules will involve three
things: defining orders on the monomials of $R^m$, constructing a division
algorithm on elements of $R^m$, and extending the Buchberger algorithm to
$R^m$. Let us consider each in turn.

The definition of a monomial order on $R^m$ is the same as the definition in
$R$ (see (2.1) from Chapter 1 of this book). Namely, we say that an ordering
relation $>$ on the monomials of $R^m$ is a monomial ordering if:

a. $>$ is a total order,

b. for every pair of monomials $\mathbf{m}, \mathbf{n} \in R^m$ with
$\mathbf{m} > \mathbf{n}$, we have $x^\alpha\mathbf{m} > x^\alpha\mathbf{n}$ for
every monomial $x^\alpha \in R$, and

c. $>$ is a well-ordering.

Exercise 1. Show that condition c is equivalent to $x^\alpha\mathbf{m} > \mathbf{m}$
for all monomials $\mathbf{m} \in R^m$ and all monomials $x^\alpha \in R$ such
that $x^\alpha \neq 1$.

Some of the most common and useful monomial orders on R^ come by 
extending monomial orders on R itself. There are two especially natural 
ways to do this, once we choose an ordering on the standard basis vectors. 
We will always use the “top down” ordering on the entries in a column:

$$\mathbf{e}_1 > \mathbf{e}_2 > \cdots > \mathbf{e}_m,$$

although any other ordering could be used as well. (Note that this is the
reverse of the numerical order on the subscripts.)

(2.4) Definition. Let $>$ be any monomial order on $R$.

a. (TOP extension of $>$) We say $x^\alpha\mathbf{e}_i >_{TOP} x^\beta\mathbf{e}_j$
if $x^\alpha > x^\beta$, or if $x^\alpha = x^\beta$ and $i < j$.

b. (POT extension of $>$) We say $x^\alpha\mathbf{e}_i >_{POT} x^\beta\mathbf{e}_j$
if $i < j$, or if $i = j$ and $x^\alpha > x^\beta$.

This terminology follows [AL], Definitions 3.5.2 and 3.5.3 (except for
the ordering on the $\mathbf{e}_i$). Following Adams and Loustaunau, TOP stands
for “term-over-position,” which is certainly appropriate since a TOP order
sorts monomials first by the term order on $R$, then breaks ties using the
position within the vector in $R^m$. On the other hand, POT stands for
“position-over-term.”

Exercise 2. Verify that for any monomial order $>$ on $R$, both $>_{TOP}$ and
$>_{POT}$ define monomial orders on $R^m$.
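
Definition (2.4) translates directly into sort keys. The sketch below is an illustration only: it assumes the underlying order on $R = k[x, y]$ is lex with $x > y$ (so that exponent tuples compare like Python tuples), stores $x^\alpha\mathbf{e}_i$ as `(alpha, i)`, and reads the second entry of the vector in (2.2) as $4x^3 + 2y$. The names `top_key` and `pot_key` are hypothetical.

```python
def top_key(mono):
    """Sort key for the TOP extension of lex: term first, then position."""
    alpha, i = mono
    return (alpha, -i)      # a smaller index i means a larger basis vector


def pot_key(mono):
    """Sort key for the POT extension of lex: position first, then term."""
    alpha, i = mono
    return (-i, alpha)


# Monomials of f from (2.2), exponents listed as (deg x, deg y).
monomials = [((1, 2), 1), ((0, 10), 1), ((0, 0), 1),
             ((3, 0), 2), ((0, 1), 2), ((1, 0), 3)]

top_sorted = sorted(monomials, key=top_key, reverse=True)
pot_sorted = sorted(monomials, key=pot_key, reverse=True)
print(top_sorted)
print(pot_sorted)
```

Sorting in decreasing order puts $x^3\mathbf{e}_2$ first for the TOP order and $xy^2\mathbf{e}_1$ first for the POT order, since the POT order exhausts all monomials containing $\mathbf{e}_1$ before looking at any others.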

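As a simple example, if we extend the lex order on $k[x, y]$ with $x > y$ to
a TOP order on $k[x, y]^3$ we get an order $>_1$ such that the terms in (2.2)
are ordered as follows.

$$4x^3\mathbf{e}_2 >_1 5xy^2\mathbf{e}_1 >_1 16x\mathbf{e}_3 >_1 -y^{10}\mathbf{e}_1 >_1 2y\mathbf{e}_2 >_1 3\mathbf{e}_1.$$

If we extend the same lex order to a POT order on $k[x, y]^3$ we get an order
$>_2$ such that

$$5xy^2\mathbf{e}_1 >_2 -y^{10}\mathbf{e}_1 >_2 3\mathbf{e}_1 >_2 4x^3\mathbf{e}_2 >_2 2y\mathbf{e}_2 >_2 16x\mathbf{e}_3.$$

In either case, we have $\mathbf{e}_1 > \mathbf{e}_2$.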

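Once we have an ordering $>$ on monomials, we can write any element
$\mathbf{f} \in R^m$ as a sum of terms

$$\mathbf{f} = \sum_{i=1}^{t} c_i \mathbf{m}_i$$

with $c_i \neq 0$ and $\mathbf{m}_1 > \mathbf{m}_2 > \cdots > \mathbf{m}_t$. We
define the leading coefficient, leading monomial, and leading term of
$\mathbf{f}$ just as in the ring case:

$$\mathrm{LC}_>(\mathbf{f}) = c_1, \qquad \mathrm{LM}_>(\mathbf{f}) = \mathbf{m}_1, \qquad \mathrm{LT}_>(\mathbf{f}) = c_1\mathbf{m}_1.$$

If, for example,

$$\mathbf{f} = \begin{pmatrix} 5xy^2 - y^{10} + 3 \\ 4x^3 + 2y \\ 16x \end{pmatrix} \in k[x, y]^3$$

as in (2.2), and $>_{TOP}$ is the TOP extension of the lex order on $k[x, y]$
($x > y$), then

$$\mathrm{LC}_{>_{TOP}}(\mathbf{f}) = 4, \qquad \mathrm{LM}_{>_{TOP}}(\mathbf{f}) = x^3\mathbf{e}_2, \qquad \mathrm{LT}_{>_{TOP}}(\mathbf{f}) = 4x^3\mathbf{e}_2.$$

Similarly, if $>_{POT}$ is the POT extension of the lex order, then

$$\mathrm{LC}_{>_{POT}}(\mathbf{f}) = 5, \qquad \mathrm{LM}_{>_{POT}}(\mathbf{f}) = xy^2\mathbf{e}_1, \qquad \mathrm{LT}_{>_{POT}}(\mathbf{f}) = 5xy^2\mathbf{e}_1.$$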

Once we have a monomial ordering in $R^m$ we can divide by a set
$F \subset R^m$ in exactly the same way we did in $R$.

(2.5) Theorem (Division algorithm in $R^m$). Fix any monomial ordering
on $R^m$ and let $F = (\mathbf{f}_1, \ldots, \mathbf{f}_s)$ be an ordered
$s$-tuple of elements of $R^m$. Then every $\mathbf{f} \in R^m$ can be written
as

$$\mathbf{f} = a_1\mathbf{f}_1 + \cdots + a_s\mathbf{f}_s + \mathbf{r},$$

where $a_i \in R$, $\mathbf{r} \in R^m$,
$\mathrm{LT}(a_i\mathbf{f}_i) \leq \mathrm{LT}(\mathbf{f})$ for all $i$, and
either $\mathbf{r} = 0$ or $\mathbf{r}$ is a $k$-linear combination of
monomials none of which is divisible by any of
$\mathrm{LM}(\mathbf{f}_1), \ldots, \mathrm{LM}(\mathbf{f}_s)$. We call
$\mathbf{r}$ the remainder on division by $F$.

Proof. To prove the existence of the $a_i \in R$ and $\mathbf{r} \in R^m$ it is
sufficient to give an algorithm for their construction. The algorithm is
word-for-word the same as that supplied in [CLO], Theorem 3 of Chapter 2, §3,
or [AL], Algorithm 1.5.1 in the ring case. (The module version appears in [AL]
as Algorithm 3.5.1). The proof of termination is also identical. □



Instead of reproducing the formal statement of the algorithm, we describe
it in words. The key operation in carrying out the division process is the
reduction of the partial dividend $\mathbf{p}$ ($\mathbf{p} = \mathbf{f}$ to
start) by an $\mathbf{f}_i$ such that $\mathrm{LT}(\mathbf{f}_i)$ divides
$\mathrm{LT}(\mathbf{p})$. If $\mathrm{LT}(\mathbf{p}) = t \cdot \mathrm{LT}(\mathbf{f}_i)$
for a term $t \in R$, we define

$$\mathrm{Red}(\mathbf{p}, \mathbf{f}_i) = \mathbf{p} - t\mathbf{f}_i$$

and say that we have reduced $\mathbf{p}$ by $\mathbf{f}_i$. One divides
$\mathbf{f}$ by $F$ by successively reducing by the first $\mathbf{f}_i$ in the
list (that is, the element with the smallest index $i$) for which reduction is
possible, and keeping track of the quotients. If at some point, reduction is
not possible, then $\mathrm{LT}(\mathbf{p})$ is not divisible by any of the
$\mathrm{LT}(\mathbf{f}_i)$. In this case, we subtract the lead term of
$\mathbf{p}$, place it into the remainder and again try to reduce with the
$\mathbf{f}_i$. The process stops when $\mathbf{p}$ is reduced to 0.
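
The verbal description above can be turned into a short program. The following Python sketch is an illustration under stated assumptions, not the formal algorithm of [CLO] or [AL]: elements of $R^m$ are dictionaries mapping monomials `(alpha, i)` to coefficients, the order is the POT extension of lex with $x > y$, and the names `leading`, `reduce_step` and `divide` are hypothetical. The sample data are the vectors of Exercise 3 below, with the second entry of $\mathbf{f}$ read as $4x^3 + 2y$.

```python
from fractions import Fraction


def pot_key(mono):
    """POT extension of lex with e_1 > e_2 > ...; mono = (alpha, i)."""
    alpha, i = mono
    return (-i, alpha)


def leading(p, key):
    """Return (leading monomial, leading coefficient) of a nonzero p."""
    m = max(p, key=key)
    return m, p[m]


def reduce_step(p, t, gamma, g):
    """Return p - t * x^gamma * g, dropping terms that cancel."""
    out = dict(p)
    for (alpha, i), c in g.items():
        mono = (tuple(a + b for a, b in zip(alpha, gamma)), i)
        value = out.get(mono, 0) - t * c
        if value == 0:
            out.pop(mono, None)
        else:
            out[mono] = value
    return out


def divide(f, divisors, key=pot_key):
    """Divide f by the ordered list `divisors`; return (quotients, r)."""
    p = dict(f)
    quotients = [dict() for _ in divisors]   # each a_i as {gamma: coeff}
    r = {}
    while p:
        m, c = leading(p, key)
        for a_i, g in zip(quotients, divisors):
            (beta, j), lc = leading(g, key)
            if j == m[1] and all(b <= a for a, b in zip(m[0], beta)):
                gamma = tuple(a - b for a, b in zip(m[0], beta))
                t = Fraction(c) / lc
                a_i[gamma] = a_i.get(gamma, 0) + t
                p = reduce_step(p, t, gamma, g)
                break
        else:                                 # no LT(f_i) divides LT(p)
            r[m] = p.pop(m)
    return quotients, r


# Data of Exercise 3, exponents listed as (deg x, deg y).
f = {((1, 2), 1): 5, ((0, 10), 1): -1, ((0, 0), 1): 3,
     ((3, 0), 2): 4, ((0, 1), 2): 2, ((1, 0), 3): 16}
f1 = {((1, 1), 1): 1, ((1, 0), 1): 4, ((0, 2), 3): 1}
f2 = {((0, 1), 2): 1, ((0, 0), 2): -1, ((1, 0), 3): 1, ((0, 0), 3): -2}
quotients, r = divide(f, [f1, f2])
```

The `for ... else` construct mirrors the rule of the text: the first divisor whose leading term divides $\mathrm{LT}(\mathbf{p})$ is used, and only when none applies is the leading term moved to the remainder. Printing `quotients` and `r` gives a way to check a hand computation.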

The following exercise gives a simple example of division in the module 
setting. When calculating by hand, it is sometimes convenient to use a 
format similar to the polynomial long division from [CLO] Chapter 2, but 
we will not do that here. 

Exercise 3. Let

$$\mathbf{f} = (5xy^2 - y^{10} + 3, \; 4x^3 + 2y, \; 16x)^T \in k[x, y]^3$$

as in (2.2), and let

$$\mathbf{f}_1 = (xy + 4x, \; 0, \; y^2)^T, \qquad \mathbf{f}_2 = (0, \; y - 1, \; x - 2)^T.$$

Let $>$ stand for the POT extension of the lex order on $k[x, y]$ with $x > y$.
Then $\mathrm{LT}(\mathbf{f}) = 5xy^2\mathbf{e}_1$,
$\mathrm{LT}(\mathbf{f}_1) = xy\mathbf{e}_1$, and
$\mathrm{LT}(\mathbf{f}_2) = y\mathbf{e}_2$. Let $\mathbf{p}$ be the
intermediate dividend at each step of the algorithm; set
$\mathbf{p} = \mathbf{f}$ to start and $a_1 = a_2 = 0$ and $\mathbf{r} = 0$.

a. Since $\mathrm{LT}(\mathbf{f}_1)$ divides $\mathrm{LT}(\mathbf{f})$, show
that the first step in the division will yield intermediate values $a_1 = 5y$,
$a_2 = 0$, $\mathbf{r} = 0$, and
$\mathbf{p} = \mathrm{Red}(\mathbf{f}, \mathbf{f}_1) = (-20xy - y^{10} + 3, \; 4x^3 + 2y, \; 16x - 5y^3)^T$.

b. $\mathrm{LT}(\mathbf{p})$ is still divisible by $\mathrm{LT}(\mathbf{f}_1)$,
so we can reduce by $\mathbf{f}_1$ again. Show that this step yields
intermediate values $a_1 = 5y - 20$, $a_2 = 0$, $\mathbf{r} = 0$, and
$\mathbf{p} = (80x - y^{10} + 3, \; 4x^3 + 2y, \; 16x - 5y^3 + 20y^2)^T$.

c. Show that in the next three steps in the division, the leading term of
$\mathbf{p}$ is in the first component, but is not divisible by the leading
term of either of the divisors. Hence after three steps we obtain intermediate
values $a_1 = 5y - 20$, $a_2 = 0$, $\mathbf{r} = (80x - y^{10} + 3, \; 0, \; 0)^T$,
and $\mathbf{p} = (0, \; 4x^3 + 2y, \; 16x - 5y^3 + 20y^2)^T$.

d. The leading term of $\mathbf{p}$ at this point is $4x^3\mathbf{e}_2$, which
is still not divisible by the leading terms of either of the divisors. Hence
the next step will remove the term $4x^3\mathbf{e}_2$ and place that into
$\mathbf{r}$ as well.

e. Complete the division algorithm on this example.

f. Now use the TOP extension of the lex order and divide $\mathbf{f}$ by
$(\mathbf{f}_1, \mathbf{f}_2)$ using this new order.

The division algorithm behaves best when the set of divisors has the
defining property of a Gröbner basis.

(2.6) Definition. Let $M$ be a submodule of $R^m$, and let $>$ be a monomial
order.

a. We will denote by $\langle \mathrm{LT}(M) \rangle$ the monomial submodule
generated by the leading terms of all $\mathbf{f} \in M$ with respect to $>$.

b. A finite collection $\mathcal{G} = \{\mathbf{g}_1, \ldots, \mathbf{g}_s\} \subset M$
is called a Gröbner basis for $M$ if
$\langle \mathrm{LT}(M) \rangle = \langle \mathrm{LT}(\mathbf{g}_1), \ldots, \mathrm{LT}(\mathbf{g}_s) \rangle$.

The good properties of ideal Gröbner bases with respect to division
extend immediately to this new setting, and with the same proofs.

(2.7) Proposition. Let $\mathcal{G}$ be a Gröbner basis for a submodule
$M \subset R^m$, and let $\mathbf{f} \in R^m$.

a. $\mathbf{f} \in M$ if and only if the remainder on division by $\mathcal{G}$
is zero.

b. A Gröbner basis for $M$ generates $M$ as a module:
$M = \langle \mathcal{G} \rangle$.

Part a of this proposition gives a solution of the submodule membership
problem stated at the start of this section, provided that we have a Gröbner
basis for the submodule $M$ in question. For example, the divisors
$\mathbf{f}_1, \mathbf{f}_2$ in Exercise 3 do form a Gröbner basis for the
submodule $M$ they generate, with respect to the POT extension of the lex
order. (This will follow from Theorem (2.9) below, for instance.) Since the
remainder on division of $\mathbf{f}$ is not zero, $\mathbf{f} \notin M$.

Some care must be exercised in summarizing part b of the proposition
in words. It is not usually true that a Gröbner basis is a basis for $M$ as
an $R$-module: a Gröbner basis is a set of generators for $M$, but it need
not be linearly independent over $R$. However, Gröbner bases do exist for
all submodules of $R^m$, by essentially the same argument as for ideals.

Exercise 4. By Proposition (2.3),
$\langle \mathrm{LT}(M) \rangle = \langle \mathbf{m}_1, \ldots, \mathbf{m}_t \rangle$
for some finite collection of monomials. Let $\mathbf{f}_i \in M$ be an element
with $\mathrm{LT}(\mathbf{f}_i) = \mathbf{m}_i$.

a. Show that $\{\mathbf{f}_1, \ldots, \mathbf{f}_t\}$ is a Gröbner basis for $M$.

b. Use Proposition (2.7) to show that every submodule of $R^m$ is finitely
generated.

Reduced and monic Gröbner bases may be defined as for ideals, and there
is a unique monic (reduced) Gröbner basis for each submodule in $R^m$ once
we choose a monomial order.

Now we turn to the extension of Buchberger’s Algorithm to the module 
case. 

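(2.8) Definition. Fix a monomial order on $R^m$, and let
$\mathbf{f}, \mathbf{g} \in R^m$. The S-vector of $\mathbf{f}$ and $\mathbf{g}$,
denoted $S(\mathbf{f}, \mathbf{g})$, is the following element of $R^m$. Let
$\mathbf{m} = \mathrm{LCM}(\mathrm{LT}(\mathbf{f}), \mathrm{LT}(\mathbf{g}))$ as
defined above. Then

$$S(\mathbf{f}, \mathbf{g}) = \frac{\mathbf{m}}{\mathrm{LT}(\mathbf{f})}\,\mathbf{f} - \frac{\mathbf{m}}{\mathrm{LT}(\mathbf{g})}\,\mathbf{g}.$$

For example, if $\mathbf{f} = (xy - x, \; x^3 + y)^T$ and
$\mathbf{g} = (x^2 + 2y^2, \; x^2 - y^2)^T$ in $k[x, y]^2$ and we use the POT
extension of the lex order on $R = k[x, y]$ with $x > y$, then

$$S(\mathbf{f}, \mathbf{g}) = x\mathbf{f} - y\mathbf{g} = (-x^2 - 2y^3, \; x^4 - x^2y + xy + y^3)^T.$$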
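
The S-vector is easy to compute mechanically. The sketch below is illustrative only: it uses the same hypothetical dictionary encoding `{(alpha, i): coefficient}` and POT-lex sort key as before, and it reads the example above as $\mathbf{f} = (xy - x, x^3 + y)^T$ and $\mathbf{g} = (x^2 + 2y^2, x^2 - y^2)^T$.

```python
from fractions import Fraction


def pot_key(mono):
    """POT extension of lex with e_1 > e_2 > ...; mono = (alpha, i)."""
    alpha, i = mono
    return (-i, alpha)


def scaled(c, gamma, g):
    """Return c * x^gamma * g."""
    return {(tuple(a + b for a, b in zip(alpha, gamma)), i): c * v
            for (alpha, i), v in g.items()}


def subtract(p, q):
    """Return p - q, dropping zero coefficients."""
    out = dict(p)
    for mono, v in q.items():
        value = out.get(mono, 0) - v
        if value == 0:
            out.pop(mono, None)
        else:
            out[mono] = value
    return out


def s_vector(f, g, key=pot_key):
    """S(f, g) = (m/LT(f)) f - (m/LT(g)) g with m = LCM(LT(f), LT(g))."""
    (alpha, i), cf = max(f.items(), key=lambda item: key(item[0]))
    (beta, j), cg = max(g.items(), key=lambda item: key(item[0]))
    if i != j:                      # the LCM is 0, so the S-vector is 0
        return {}
    lcm = tuple(max(a, b) for a, b in zip(alpha, beta))
    to_f = tuple(l - a for l, a in zip(lcm, alpha))
    to_g = tuple(l - b for l, b in zip(lcm, beta))
    return subtract(scaled(Fraction(1) / cf, to_f, f),
                    scaled(Fraction(1) / cg, to_g, g))


# f = (xy - x, x^3 + y)^T and g = (x^2 + 2y^2, x^2 - y^2)^T
f = {((1, 1), 1): 1, ((1, 0), 1): -1, ((3, 0), 2): 1, ((0, 1), 2): 1}
g = {((2, 0), 1): 1, ((0, 2), 1): 2, ((2, 0), 2): 1, ((0, 2), 2): -1}
S = s_vector(f, g)
```

Dividing by the leading coefficients makes the two leading terms cancel exactly, which is the whole point of the construction; when the leading terms sit in different components the function returns the zero vector, in line with the convention $\mathrm{LCM} = 0$.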

The foundation for an algorithm for computing Gröbner bases is the
following generalization of Buchberger’s Criterion.

(2.9) Theorem (Buchberger’s Criterion for Submodules). A set
$\mathcal{G} = \{\mathbf{g}_1, \ldots, \mathbf{g}_s\} \subset R^m$ is a Gröbner
basis for the module it generates if and only if the remainder on division by
$\mathcal{G}$ of $S(\mathbf{g}_i, \mathbf{g}_j)$ is 0 for all $i, j$.

Proof. The proof is essentially the same as in the ideal case. □

For example, $\mathcal{G} = \{\mathbf{f}_1, \mathbf{f}_2\}$ from Exercise 3 is
a Gröbner basis for the submodule it generates in $k[x, y]^3$, with respect to
the POT extension of the lex order. The reason is that the leading terms of
the $\mathbf{f}_i$ contain different standard basis vectors, so their least
common multiple is zero. As a result, the S-vector satisfies
$S(\mathbf{f}_1, \mathbf{f}_2) = 0$, and Buchberger’s Criterion implies that
$\mathcal{G}$ is a Gröbner basis.

For a less trivial example, if we define $M$ to be the matrix

$$M = \begin{pmatrix} a^2 + b^2 & a^3 - 2bcd & a - b \\ c^2 - d^2 & b^3 + acd & c + d \end{pmatrix}$$

over $R = k[a, b, c, d]$, then a TOP grevlex Gröbner basis $\mathcal{G}$ for the
submodule generated by the columns of $M$ has four elements:

(2.10)
$$\begin{aligned}
\mathbf{g}_1 &= (b^2, \; -ac/2 - bc/2 + c^2/2 - ad/2 - bd/2 - d^2/2)^T, \\
\mathbf{g}_2 &= (a - b, \; c + d)^T, \\
\mathbf{g}_3 &= (-2bcd, \; b^3 - abc/2 + b^2c/2 - ac^2 + bc^2/2 - abd/2 + b^2d/2 + acd + ad^2 - bd^2/2)^T, \\
\mathbf{g}_4 &= (0, \; a^2c + b^2c - ac^2 + bc^2 + a^2d + b^2d + ad^2 - bd^2)^T.
\end{aligned}$$

Note that $\mathrm{LT}(\mathbf{g}_1) = b^2\mathbf{e}_1$ and
$\mathrm{LT}(\mathbf{g}_2) = a\mathbf{e}_1$ for the TOP extension of grevlex
on $k[a, b, c, d]$. Hence

$$\begin{aligned}
S(\mathbf{g}_1, \mathbf{g}_2) &= a\mathbf{g}_1 - b^2\mathbf{g}_2 \\
&= (b^3, \; -a^2c/2 - abc/2 + ac^2/2 - a^2d/2 - abd/2 - ad^2/2 - b^2c - b^2d)^T \\
&= b\,\mathbf{g}_1 - (1/2)\,\mathbf{g}_4,
\end{aligned}$$

so that $S(\mathbf{g}_1, \mathbf{g}_2)$ reduces to 0 on division by
$\mathcal{G}$. It is easy to check that all the other S-vectors reduce to 0
modulo $\mathcal{G}$ as well.

To compute Gröbner bases, we need a version of Buchberger’s Algorithm.
Using Theorem (2.9), this extends immediately to the module case.

(2.11) Theorem (Buchberger’s Algorithm for Submodules). Let
$F = (\mathbf{f}_1, \ldots, \mathbf{f}_t)$ where $\mathbf{f}_i \in R^m$, and fix
a monomial order on $R^m$. The following algorithm computes a Gröbner basis
$\mathcal{G}$ for $M = \langle F \rangle \subset R^m$, where
$\overline{S(\mathbf{f}, \mathbf{g})}^{\mathcal{G}'}$ denotes the remainder on
division by $\mathcal{G}'$, using Theorem (2.5):

Input: $F = (\mathbf{f}_1, \ldots, \mathbf{f}_t) \subset R^m$, an order $>$
Output: a Gröbner basis $\mathcal{G}$ for $M = \langle F \rangle$, with respect to $>$

$\mathcal{G} := F$
REPEAT
    $\mathcal{G}' := \mathcal{G}$
    FOR each pair $\mathbf{f} \neq \mathbf{g}$ in $\mathcal{G}'$ DO
        $S := \overline{S(\mathbf{f}, \mathbf{g})}^{\mathcal{G}'}$
        IF $S \neq 0$ THEN $\mathcal{G} := \mathcal{G} \cup \{S\}$
UNTIL $\mathcal{G} = \mathcal{G}'$

Proof. Once again, the proof is the same as in the ideal case, using the
fact from Proposition (2.3) that the ascending chain condition holds for
monomial submodules to ensure termination. □

Unfortunately, the Gröbner basis packages in Maple and Mathematica do
not allow computation of Gröbner bases for submodules of $R^m$ for $m > 1$ by
the methods introduced above. The CALI package for REDUCE, CoCoA,
Singular and Macaulay do have this capability, however. For instance,
the Gröbner basis in (2.10) was computed using the implementation of
Buchberger’s Algorithm in the computer algebra system Macaulay (though
in this small example, the computations would also be feasible by hand). In
Exercise 8 below, you will see how the computation was done. In Exercise
9, we re-do the computation using the computer algebra system Singular.
Exercise 10 presents an additional application of the techniques of this
section: computation in the quotient modules $R^m/M$.



Additional Exercises for §2 

Exercise 5. This exercise will supply some of the details for the proof of
Proposition (2.3).

a. Show that if $M$ is a submodule of $R^m$ and $M_i = M \cap R\mathbf{e}_i$,
then $M_i$ is a submodule of $M$.

b. Using part a of Proposition (2.3), show that monomial submodules of
$R^m$ satisfy the ascending chain condition. That is, for every infinite
ascending chain $M_1 \subset M_2 \subset \cdots$ of monomial submodules of
$R^m$, there exists $N$ such that $M_N = M_{N+1} = \cdots = M_{N+\ell}$ for all
$\ell \geq 0$. Hint: Consider $\bigcup_{n=1}^{\infty} M_n$, which is also a
monomial submodule.

Exercise 6. In this exercise, we will see how the theory of submodules of
$R^m$ can be “emulated” via ideals in a larger polynomial ring obtained by
introducing additional variables $X_1, \ldots, X_m$ corresponding to the
standard basis vectors in $R^m$. Write
$S = k[x_1, \ldots, x_n, X_1, \ldots, X_m]$, and define a mapping
$\varphi : R^m \to S$ as follows. For each $\mathbf{f} \in R^m$, expand
$\mathbf{f} = \sum_{j=1}^{m} f_j\mathbf{e}_j$, where $f_j \in R$, and let
$F = \varphi(\mathbf{f}) \in S$ be the polynomial $F = \sum_{j=1}^{m} f_j X_j$.

a. Show that $S$ can be viewed as an $R$-module, where the scalar
multiplication by elements of $R$ is the restriction of the ring product in
$S$.

b. Let $S_1 \subset S$ denote the vector subspace of $S$ consisting of
polynomials that are homogeneous linear in the $X_j$ (the $k$-span of the
collection of monomials of the form $x^\alpha X_j$). Show that $S_1$ is an
$R$-submodule of $S$.

c. For each submodule $M = \langle \mathbf{f}_1, \ldots, \mathbf{f}_s \rangle \subset R^m$,
let $F_i = \varphi(\mathbf{f}_i) \in S$. Show that $\varphi(M)$ equals
$\langle F_1, \ldots, F_s \rangle S \cap S_1$, where
$\langle F_1, \ldots, F_s \rangle S$ denotes the ideal generated by the $F_i$
in $S$.

d. Show that a Gröbner basis for the module $M$ could be obtained by
applying a modified form of Buchberger’s Algorithm for ideals to
$I = \langle F_1, \ldots, F_s \rangle S$. The modification would be to compute
remainders only for the S-polynomials $S(F_i, F_j)$ that are contained in
$S_1$, and ignore all other S-pairs.

Exercise 7. Let $R = k[x]$, the polynomial ring in one variable. Let $M$
be a submodule of $R^m$ for some $m > 1$. Describe the form of the unique
monic reduced Gröbner basis for $M$ with respect to the POT extension of
the degree order on $R$. In particular, how many Gröbner basis elements
are there whose leading terms contain each $\mathbf{e}_i$? What is true of the
$i$th components of the other basis elements if some leading term contains
$\mathbf{e}_i$?

Some Basic Information About Macaulay.

Since we have not used Macaulay before in this book, and since it is
rather different in design from general computer algebra systems such as
Maple, a few words about its set-up are probably appropriate at this point.
For more information on this program, we refer the reader to the manual
that is available together with the program via anonymous ftp from
zariski.harvard.edu. Macaulay is a computer algebra system specifically
designed for computations in algebraic geometry and commutative algebra.
Its basic computational engine is a full implementation of Buchberger’s
algorithm for modules over polynomial rings. Built-in commands for
manipulating ideals and submodules in various ways, performing division as in
Theorem (2.5), computing Gröbner bases, syzygies, Hilbert functions, free
resolutions (see Chapter 6 of this book), displaying results of computations,
etc. as well as “scripts” for many other geometric operations are provided.
A complete list of the available commands can be obtained by entering the
command commands at the % prompt in a Macaulay session, and typing a
command name will generate a help listing describing the correct format
for the command.

In using Macaulay, two of its key features must be kept in mind. First,
to speed up computation and minimize memory requirements, all computations
in Macaulay are done with coefficients in a (finite) prime field
$k = \mathbb{Z}/(p)$. The default is $p = 31991$; the user can also specify a
different $p$ if desired. This means that Macaulay is less appropriate for
some applications (especially elimination-based equation solving over
$\mathbb{Q}$). But for other computations in algebraic geometry, it can be
extremely useful because its Gröbner basis implementation is much faster than
that in Maple, for example. Second, all ideals are homogeneous and all modules
are graded modules in Macaulay. That is, all generators for an ideal in $R$
must be homogeneous polynomials. Moreover, all generators for a submodule in
$R^m$ must be column vectors of homogeneous polynomials of the same total
degree with respect to some grading. Hence examples with non-homogeneous
polynomials are somewhat more difficult to deal with in Macaulay. We will
discuss these matters in more detail in Chapter 6.

Before working with a submodule of $R^m$ in Macaulay, the base ring $R$
must be defined. The most succinct way to do this is to use the <ring script.
Part of the definition of a base ring is a monomial order. The default (which
is what you get via the <ring script) is the grevlex order on the polynomial
ring, and a TOP extension to submodules in $R^m$ with the standard basis
ordered $\mathbf{e}_m > \cdots > \mathbf{e}_1$, as in Definition (2.4). (There
is also a ring command which allows you to specify other monomial orders.)

Exercise 8.

a. In these examples, we will work over $R = k[a, b, c, d]$ with
$k = \mathbb{Z}/(31991)$, so enter

    <ring 4 a-d R

at the % prompt in a Macaulay session.

b. To define the submodule generated by the columns of

$$M = \begin{pmatrix} a^2 + b^2 & a^3 - 2bcd & a - b \\ c^2 - d^2 & b^3 + acd & c + d \end{pmatrix},$$

enter the command mat M, then the numbers of rows and columns and
the entries as prompted.

c. The std command (“standard” basis) computes the Gröbner basis for
the module generated by the columns of the matrix. Use it to compute
the Gröbner basis for M and put the result in a new matrix called MM.

d. To see the results, enter put std MM. Compare with the Gröbner basis
given above in (2.10); they should be the same. Note: even though the
computation was done over $k = \mathbb{Z}/(31991)$, the results are valid with
$k = \mathbb{Q}$ as well in this case. This will not always be true!

Using Singular to Compute in Modules.

We used Singular in the last chapter for computations in local rings.
This very powerful program is also very well-suited for computations in
modules over polynomial rings. We demonstrate by redoing Exercise 8 using
Singular. Unlike Macaulay, Singular does not require that ideals be
homogeneous or modules graded. It also allows one to work in characteristic
zero (although complicated computations go faster over finite prime
fields).

Exercise 9.

a. We will work over $R = k[a, b, c, d]$ with $k = \mathbb{Q}$, so enter

    ring R=0, (a,b,c,d), (dp,C);

at the > prompt in a Singular session. The first term “ring R=0” asserts
that R will be a ring with characteristic 0 and the second that R
have indeterminates $a, b, c, d$. Had we wanted $k = \mathbb{Z}/31991$ and
indeterminates $x, y, z$, we would have entered “ring R=31991, (x,y,z)” at the
prompt >. The third term specifies the ordering. Examples of possible
well-orderings on R are lex, grevlex, and grlex, specified by lp, dp and
Dp respectively. In our case, we chose grevlex. The letter C indicates the
“top down” order $\mathbf{e}_1 > \mathbf{e}_2 > \cdots > \mathbf{e}_m$ on the
standard basis elements of the free module $R^m$. The lower-case c indicates
the reverse, “bottom down” order
$\mathbf{e}_m > \cdots > \mathbf{e}_2 > \mathbf{e}_1$ on basis elements. The
pair (dp, C) indicates the TOP extension of dp to $R^m$ using the top-down
order on standard basis elements. This is the ordering we used in Exercise 8.
Had we wanted a POT extension, we would have written (C, dp). A POT
extension of a pure lex ordering on R to $R^m$ with the bottom down
ordering $\mathbf{e}_m > \cdots > \mathbf{e}_1$ would be specified by entering
(c, lp).

b. To define the submodule M generated by the columns of the matrix M
in Exercise 8b above, enter, for example,

    > vector s1 = [a2 + b2, c2 - d2];

    > vector s2 = [a3 - 2bcd, b3 + acd];

    > vector s3 = [a - b, c + d];

    > module M = s1, s2, s3;

(We have shown the prompt > which you should not re-enter.) Note that
the command vector s1 = [a2+b2, c2-d2]; defines the vector
$s_1 = (a^2 + b^2, \; c^2 - d^2)^T \in R^2$.

c. To define a module N generated by a Gröbner basis of s1, s2, s3, enter

    module N = std(M);

after the prompt >.

d. To see the result, type N; after the prompt >. Verify that you get the
same result (up to multiplication by 2) as in (2.10).

e. In addition to the TOP, top down extension of graded reverse lex,
experiment with the following different extensions of the graded reverse lex
order on $R$ to the free modules $R^m$: POT and bottom down; TOP and
bottom down; POT and top down. For which of these does the Gröbner
basis of M = ⟨s1, s2, s3⟩ have the fewest number of elements? the least
degree? What about different extensions of the lex order on $R$?

Exercise 10. In this exercise, we will show how Gröbner bases can be
applied to perform calculations in the quotient modules $R^m/M$ for
$M \subset R^m$.

a. Let $\mathcal{G}$ be a Gröbner basis for $M$ with respect to any monomial
order on $R^m$. Use Theorem (2.5) to define a one-to-one correspondence between
the cosets in $R^m/M$ and the remainders on division of $\mathbf{f} \in R^m$ by
$\mathcal{G}$.

b. Deduce that the set of monomials in the complement of
$\langle \mathrm{LT}(M) \rangle$ forms a vector space basis for $R^m/M$ over
$k$.

c. Let $R = k[a, b, c, d]$. Find a vector space basis for the quotient module
$R^2/M$ where $M$ is generated by the columns of the matrix from Exercise
8, using the TOP grevlex Gröbner basis from (2.10). (Note: $R^2/M$ is not
finite-dimensional in this case.)

d. Explain how to compute the sum and scalar multiplication operations in
$R^m/M$ using part a. Hint: see Chapter 2, §2 of this book for a discussion
of the ideal case.

e. Let $R = k[x_1, \ldots, x_n]$. State and prove a criterion for
finite-dimensionality of $R^m/M$ as a vector space over $k$ generalizing the
Finiteness Theorem from Chapter 2, §2 of this book.



§3 Computing Syzygies 

In this section, we begin the study of syzygies on a set of elements of a 
module, and we shall show how to solve Problem (2.1) b of the last section. 
Once again $R$ will stand for a polynomial ring $k[x_1, \ldots, x_n]$. Solving
this problem will allow us to find a matrix presenting any submodule of $R^m$
for which we know a set of generators.

Gröbner bases play a central role here because of the following key
observation. In computing a Gröbner basis $\mathcal{G} = \{g_1, \ldots, g_s\}$
for an ideal $I \subset R$ with respect to some fixed monomial ordering using
Buchberger’s algorithm, a slight modification of the algorithm would actually
compute a set of generators for the module of syzygies
$\mathrm{Syz}(g_1, \ldots, g_s)$ as well as the $g_i$ themselves. The main idea
is that Buchberger’s S-polynomial criterion for ideal Gröbner bases is
precisely the statement that a set of generators is a Gröbner basis if and
only if every homogeneous syzygy on the leading terms of the generators
“lifts” to a syzygy on the generators, in the sense described in the following
theorem. The “lifting” is accomplished by the division algorithm.

To prepare for the theorem, let $S(g_i, g_j)$ be the S-polynomial of $g_i$ and
$g_j$:

$$S(g_i, g_j) = \frac{x^{\gamma_{ij}}}{\mathrm{LT}(g_i)}\, g_i - \frac{x^{\gamma_{ij}}}{\mathrm{LT}(g_j)}\, g_j,$$

where $x^{\gamma_{ij}}$ is the least common multiple of $\mathrm{LM}(g_i)$ and
$\mathrm{LM}(g_j)$ (see (2.2) of Chapter 1 of this book). Since $\mathcal{G}$ is
a Gröbner basis, by Buchberger’s Criterion from §3 of Chapter 1, the remainder
of $S(g_i, g_j)$ upon division by $\mathcal{G}$ is 0, and the division
algorithm gives an expression

$$S(g_i, g_j) = \sum_{k=1}^{s} a_{ijk}\, g_k,$$

where $a_{ijk} \in R$, and
$\mathrm{LT}(a_{ijk} g_k) \leq \mathrm{LT}(S(g_i, g_j))$ for all $k$.
Let $\mathbf{a}_{ij} \in R^s$ denote the column vector

$$\mathbf{a}_{ij} = a_{ij1}\mathbf{e}_1 + a_{ij2}\mathbf{e}_2 + \cdots + a_{ijs}\mathbf{e}_s = \begin{pmatrix} a_{ij1} \\ a_{ij2} \\ \vdots \\ a_{ijs} \end{pmatrix} \in R^s$$

and define $\mathbf{s}_{ij} \in R^s$ by setting

(3.1)  $$\mathbf{s}_{ij} = \frac{x^{\gamma_{ij}}}{\mathrm{LT}(g_i)}\,\mathbf{e}_i - \frac{x^{\gamma_{ij}}}{\mathrm{LT}(g_j)}\,\mathbf{e}_j - \mathbf{a}_{ij}$$

in $R^s$. Then we have the following theorem of Schreyer from [Schre1].



(3.2) Theorem. Let $\mathcal{G} = \{g_1, \ldots, g_s\}$ be a Gröbner basis of an
ideal $I$ in $R$ with respect to some fixed monomial order, and let
$M = \mathrm{Syz}(g_1, \ldots, g_s)$. The collection
$\{\mathbf{s}_{ij}, \; 1 \leq i, j \leq s\}$ from (3.1) generates $M$ as an
$R$-module.

Part of this statement is easy to verify from the definition of the
$\mathbf{s}_{ij}$.

Exercise 1. Prove that $\mathbf{s}_{ij}$ is an element of the syzygy module
$M$ for all $i, j$.

The first two terms

$$\frac{x^{\gamma_{ij}}}{\mathrm{LT}(g_i)}\,\mathbf{e}_i - \frac{x^{\gamma_{ij}}}{\mathrm{LT}(g_j)}\,\mathbf{e}_j$$

in expression (3.1) for $\mathbf{s}_{ij}$ form a column vector which is a syzygy
on the leading terms of the $g_i$ (that is, an element of
$\mathrm{Syz}(\mathrm{LT}(g_1), \ldots, \mathrm{LT}(g_s))$). The “lifting”
referred to above consists of adding the additional terms $-\mathbf{a}_{ij}$ in
(3.1) to produce a syzygy on the $g_i$ themselves (that is, an element of
$\mathrm{Syz}(g_1, \ldots, g_s)$).
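
A small worked instance may help fix the notation. It is an illustration only and not part of the original text: take $g_1 = x$ and $g_2 = y^2 + 1$ in $k[x, y]$ with the lex order, $x > y$ (a Gröbner basis, since the leading terms $x$ and $y^2$ are relatively prime), and read (3.1) as $\mathbf{s}_{ij} = (x^{\gamma_{ij}}/\mathrm{LT}(g_i))\,\mathbf{e}_i - (x^{\gamma_{ij}}/\mathrm{LT}(g_j))\,\mathbf{e}_j - \mathbf{a}_{ij}$. Here $x^{\gamma_{12}} = xy^2$.

```latex
\begin{align*}
S(g_1, g_2) &= \frac{xy^2}{x}\, g_1 - \frac{xy^2}{y^2}\, g_2
             = y^2 \cdot x - x\,(y^2 + 1) = -x = (-1)\, g_1 + 0 \cdot g_2, \\
\mathbf{a}_{12} &= (-1, \; 0)^T, \\
\mathbf{s}_{12} &= y^2\, \mathbf{e}_1 - x\, \mathbf{e}_2 - \mathbf{a}_{12}
                 = (y^2 + 1, \; -x)^T, \\
(y^2 + 1)\, g_1 + (-x)\, g_2 &= x\,(y^2 + 1) - x\,(y^2 + 1) = 0 .
\end{align*}
```

The first two terms $y^2\mathbf{e}_1 - x\mathbf{e}_2$ are a syzygy on the leading terms $x$ and $y^2$ only; subtracting $\mathbf{a}_{12}$ is what turns this into a syzygy on $g_1, g_2$ themselves.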

A direct proof of this theorem can be obtained by a careful reconsid- 
eration of the proof of Buchberger’s Criterion (see Theorem 6 of [CLO], 
Chapter 2, §6). Schreyer’s proof, which is actually significantly simpler, 
comes as a byproduct of the theory of Grobner bases for modules, and it 
establishes quite a bit more in the process. So we will present it here. 

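First, let us note that we can parallel in the module case the observations
made above. Let $\mathcal{G} = \{\mathbf{g}_1, \ldots, \mathbf{g}_s\}$ be a
Gröbner basis for any submodule $M \subset R^m$ with respect to some fixed
monomial order $>$. Since $\mathcal{G}$ is a Gröbner basis, by Theorem (2.9)
now, the remainder of $S(\mathbf{g}_i, \mathbf{g}_j)$ on division by
$\mathcal{G}$ is 0, and the division algorithm gives an expression

$$S(\mathbf{g}_i, \mathbf{g}_j) = \sum_{k=1}^{s} a_{ijk}\, \mathbf{g}_k,$$

where $a_{ijk} \in R$, and
$\mathrm{LT}(a_{ijk}\mathbf{g}_k) \leq \mathrm{LT}(S(\mathbf{g}_i, \mathbf{g}_j))$
for all $k$.

Write $\epsilon_1, \ldots, \epsilon_s$ for the standard basis vectors in $R^s$.
Let $\mathbf{m}_{ij} = \mathrm{LCM}(\mathrm{LT}(\mathbf{g}_i), \mathrm{LT}(\mathbf{g}_j))$,
and let $\mathbf{a}_{ij} \in R^s$ denote the column vector

$$\mathbf{a}_{ij} = a_{ij1}\epsilon_1 + a_{ij2}\epsilon_2 + \cdots + a_{ijs}\epsilon_s \in R^s.$$

For the pairs $(i, j)$ such that $\mathbf{m}_{ij} \neq 0$, define
$\mathbf{s}_{ij} \in R^s$ by setting

$$\mathbf{s}_{ij} = \frac{\mathbf{m}_{ij}}{\mathrm{LT}(\mathbf{g}_i)}\,\epsilon_i - \frac{\mathbf{m}_{ij}}{\mathrm{LT}(\mathbf{g}_j)}\,\epsilon_j - \mathbf{a}_{ij}$$

in $R^s$, and let $\mathbf{s}_{ij}$ be zero otherwise. Since a Gröbner basis
for a module generates the module by Proposition (2.7), the following theorem
includes Theorem (3.2) as a special case. Hence, by proving this more general
result we will establish Theorem (3.2) as well.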

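(3.3) Theorem (Schreyer’s Theorem). Let $\mathcal{G} \subset R^m$ be a Gröbner
basis with respect to any monomial order $>$ on $R^m$. The $\mathbf{s}_{ij}$
form a Gröbner basis for the syzygy module
$M = \mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s)$ with respect to a
monomial order $>_{\mathcal{G}}$ on $R^s$ defined as follows:
$x^\alpha\epsilon_i >_{\mathcal{G}} x^\beta\epsilon_j$ if
$\mathrm{LT}_>(x^\alpha\mathbf{g}_i) > \mathrm{LT}_>(x^\beta\mathbf{g}_j)$ in
$R^m$, or if
$\mathrm{LT}_>(x^\alpha\mathbf{g}_i) = \mathrm{LT}_>(x^\beta\mathbf{g}_j)$ and
$i < j$.

Proof. We leave it to the reader as Exercise 2 below to show that
$>_{\mathcal{G}}$ is a monomial order on $R^s$. Since
$S(\mathbf{g}_i, \mathbf{g}_j)$ and $S(\mathbf{g}_j, \mathbf{g}_i)$ differ only
by a sign, it suffices to consider the $\mathbf{s}_{ij}$ for $i < j$ only. We
claim first that if $i < j$, then

(3.4)  $$\mathrm{LT}_{>_{\mathcal{G}}}(\mathbf{s}_{ij}) = \frac{\mathbf{m}_{ij}}{\mathrm{LT}(\mathbf{g}_i)}\,\epsilon_i.$$

Since we take $i < j$, this term is larger than
$(\mathbf{m}_{ij}/\mathrm{LT}(\mathbf{g}_j))\,\epsilon_j$ in the
$>_{\mathcal{G}}$ order. It is also larger than any of the terms in
$\mathbf{a}_{ij}$, for the following reason. The $a_{ijk}$ are obtained via the
division algorithm, dividing $S = S(\mathbf{g}_i, \mathbf{g}_j)$ with respect
to $\mathcal{G}$. Hence
$\mathrm{LT}_>(S) \geq \mathrm{LT}_>(a_{ij\ell}\,\mathbf{g}_\ell)$ for all
$\ell = 1, \ldots, s$ (in $R^m$). However, by the definition of the S-vector,

$$\mathrm{LT}_>\!\left(\frac{\mathbf{m}_{ij}}{\mathrm{LT}(\mathbf{g}_i)}\,\mathbf{g}_i\right) > \mathrm{LT}_>(S),$$

since the S-vector is guaranteed to produce a cancellation of leading terms.
Putting these two inequalities together establishes (3.4).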

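Now let $\mathbf{f} = \sum_{i=1}^{s} f_i\epsilon_i$ be any element of the
syzygy module $M$, let
$\mathrm{LT}_{>_{\mathcal{G}}}(f_i\epsilon_i) = m_i\epsilon_i$ for some term
$m_i$ appearing in $f_i$. Further, let
$\mathrm{LT}_{>_{\mathcal{G}}}(\mathbf{f}) = m_v\epsilon_v$ for some $v$, and
let

$$\mathbf{s} = \sum_{u} m_u\epsilon_u,$$

where the sum is taken over all $u$ such that
$m_u\,\mathrm{LT}_>(\mathbf{g}_u) = m_v\,\mathrm{LT}_>(\mathbf{g}_v)$.
By definition, it follows that $\mathbf{s}$ is an element of
$\mathrm{Syz}(\{\mathrm{LT}_>(\mathbf{g}_u) : u \geq v\})$.
By part c of Proposition (2.3) of this chapter, it follows that $\mathbf{s}$ is
an element of the submodule of $R^s$ generated by the

$$\sigma_{uw} = \frac{\mathbf{m}_{uw}}{\mathrm{LT}_>(\mathbf{g}_u)}\,\epsilon_u - \frac{\mathbf{m}_{uw}}{\mathrm{LT}_>(\mathbf{g}_w)}\,\epsilon_w,$$

where $v \leq u < w$. It follows that
$\mathrm{LT}_{>_{\mathcal{G}}}(\mathbf{f})$ is divisible by
$\mathrm{LT}_{>_{\mathcal{G}}}(\mathbf{s}_{vw})$ for some $w > v$. So by
definition the $\mathbf{s}_{ij}$ form a Gröbner basis for $M$ with respect to
the $>_{\mathcal{G}}$ order. □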

Exercise 2. Verify that the order $>_{\mathcal{G}}$ introduced in the statement
of the theorem is a monomial order on $R^s$.

Theorem (3.3) gives the outline for an algorithm for computing
$\mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s)$ for any Gröbner basis
$\mathcal{G} = \{\mathbf{g}_1, \ldots, \mathbf{g}_s\}$ using the division
algorithm. Hence we have a solution of Problem (2.1) b in the special
case of computing syzygies for Gröbner bases. Using this, we will see how
to compute the module of syzygies on any set of generators
$\{\mathbf{f}_1, \ldots, \mathbf{f}_t\}$ for a submodule of $R^m$.

So suppose that we are given $\mathbf{f}_1, \ldots, \mathbf{f}_t \in R^m$, and
that we compute a Gröbner basis
$\mathcal{G} = \{\mathbf{g}_1, \ldots, \mathbf{g}_s\}$ for
$M = \langle \mathbf{f}_1, \ldots, \mathbf{f}_t \rangle = \langle \mathcal{G} \rangle$.
Let $F = (\mathbf{f}_1, \ldots, \mathbf{f}_t)$ and
$G = (\mathbf{g}_1, \ldots, \mathbf{g}_s)$ be the $m \times t$ and $m \times s$
matrices in which the $\mathbf{f}_i$’s and $\mathbf{g}_i$’s are columns,
respectively. Since the columns of $F$ and $G$ generate the same module, there
are a $t \times s$ matrix $A$ and an $s \times t$ matrix $B$, both with entries
in $R$, such that $G = FA$ and $F = GB$. The matrix $B$ can be computed by
applying the division algorithm with respect to $G$, expressing each
$\mathbf{f}_i$ in terms of the $\mathbf{g}_j$. The matrix $A$ can be generated
as a byproduct of the computation of $G$. This is because each S-vector
remainder that is added to $G$ in the basic Buchberger algorithm, Theorem
(2.11), comes with an expression in terms of the $\mathbf{f}_i$, computed by
division and substitution. However, the matrix $A$ can also be computed in an
ad hoc way in simple examples such as the following.

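Suppose, for example, that $m = 1$, so that $M$ is an ideal, say $M = \langle f_1, f_2 \rangle$
in $R = k[x, y]$, where

\[ f_1 = xy + x, \quad f_2 = y^2 + 1. \]

Using the lex monomial order with $x > y$, the reduced Gröbner basis for
$M$ consists of

\[ g_1 = x, \quad g_2 = y^2 + 1. \]

Then it is easy to check that

\[ f_1 = (y + 1)\, g_1, \]
\[ g_1 = -(1/2)(y - 1)\, f_1 + (1/2)\, x f_2, \]

so that

(3.5)   $G = (g_1, g_2) = (f_1, f_2) \begin{pmatrix} -(y - 1)/2 & 0 \\ x/2 & 1 \end{pmatrix}$

and

(3.6)   $F = (f_1, f_2) = (g_1, g_2) \begin{pmatrix} y + 1 & 0 \\ 0 & 1 \end{pmatrix}.$
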


If we express $G$ in terms of $F$ using the equation $G = FA$, then substitute
$F = GB$ on the right, we obtain an equation $G = GBA$. Similarly, $F =
FAB$. What we have done here is analogous to changing from one basis to
another in a vector space, then changing back to the first basis. However, in
the general $R$-module case, it is not necessarily true that $AB$ and $BA$ equal
identity matrices of the appropriate sizes. This is another manifestation of
the fact that a module need not have a basis over $R$. For instance, in the
example from (3.5) and (3.6) above, we have

\[ AB = \begin{pmatrix} -(y^2 - 1)/2 & 0 \\ (xy + x)/2 & 1 \end{pmatrix} \]

and

\[ BA = \begin{pmatrix} -(y^2 - 1)/2 & 0 \\ x/2 & 1 \end{pmatrix}. \]



In addition to connecting F with G, the matrices A and B also connect 
the syzygies on F and G in the following ways. 
(3.7) Lemma.

a. If $\mathbf{s} \in R^s$ (a column vector) is an element of $\mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s)$, then
the matrix product $A\mathbf{s}$ is an element of $\mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t)$.

b. Similarly, if $\mathbf{t} \in R^t$ (also a column vector) is an element of
$\mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t)$, then $B\mathbf{t} \in R^s$ is an element of $\mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s)$.

c. Each column of the matrix $I_t - AB$ also defines an element of
$\mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t)$.

Proof. Take the matrix equation $G = FA$ and multiply on the right
by the column vector $\mathbf{s} \in \mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s)$. Since matrix multiplication is
associative, we get the equation

\[ 0 = G\mathbf{s} = FA\mathbf{s} = F(A\mathbf{s}). \]

Hence $A\mathbf{s}$ is an element of $\mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t)$. Part b is proved similarly, starting
from the equation $F = GB$. Finally, $F = FAB$ implies

\[ F(I_t - AB) = F - FAB = F - F = 0, \]

and part c follows immediately. □

Our next result gives the promised solution to the problem of computing
syzygies for a general ordered $t$-tuple $F = (\mathbf{f}_1, \ldots, \mathbf{f}_t)$ of elements of $R^m$
(not necessarily a Gröbner basis).

(3.8) Proposition. Let $F = (\mathbf{f}_1, \ldots, \mathbf{f}_t)$ be an ordered $t$-tuple of elements
of $R^m$, and let $G = (\mathbf{g}_1, \ldots, \mathbf{g}_s)$ be an ordered Gröbner basis for $M = \langle F \rangle$
with respect to any monomial order in $R^m$. Let $A$ and $B$ be the matrices
introduced above, and let $\mathbf{s}_{ij}$, $1 \le i < j \le s$, be the basis for $\mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s)$
given by Theorem (3.3) or (3.2). Finally, let $S_1, \ldots, S_t$ be the columns of
the $t \times t$ matrix $I_t - AB$. Then

\[ \mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t) = \langle A\mathbf{s}_{ij}, S_1, \ldots, S_t \rangle. \]

Proof. $\langle A\mathbf{s}_{ij}, S_1, \ldots, S_t \rangle$ is a submodule of $\mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t)$, so to prove
the proposition, we must show that every syzygy on $F$ can be expressed
as an $R$-linear combination of the $A\mathbf{s}_{ij}$ and the $S_k$. To see this, let $\mathbf{t}$ be
any element of $\mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t)$. By part b of Lemma (3.7), $B\mathbf{t}$ is an element
of $\mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s)$. Since the $\mathbf{s}_{ij}$ are a Gröbner basis for that module, and
hence a generating set for $\mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s)$, there are $a_{ij} \in R$ such that

\[ B\mathbf{t} = \sum_{ij} a_{ij}\,\mathbf{s}_{ij}. \]

But multiplying both sides by $A$ on the left, this implies

\[ AB\mathbf{t} = \sum_{ij} a_{ij}\,A\mathbf{s}_{ij}, \]

so that

\[ \mathbf{t} = ((I_t - AB) + AB)\,\mathbf{t} = (I_t - AB)\,\mathbf{t} + \sum_{ij} a_{ij}\,A\mathbf{s}_{ij}. \]

The first term on the right in the last equation is an $R$-linear combination
of the columns $S_1, \ldots, S_t$ of $(I_t - AB)$, hence $\mathbf{t} \in \langle A\mathbf{s}_{ij}, S_1, \ldots, S_t \rangle$. Since
this is true for all $\mathbf{t}$, the proposition is proved. □

Note that the hypothesis in Proposition (3.8) above that $G$ is a Gröbner
basis is needed only to ensure that the $\mathbf{g}_j$ generate $M$ and that the $\mathbf{s}_{ij}$ are
a basis for the module of syzygies. More generally, if we have any set of
generators for a module $M$, and a set of generators for the module of
syzygies of that set of generators, then we can find a generating set of
syzygies on any other generating set of $M$.

(3.9) Corollary. Let the notation be as above, but assume only that $G =
(\mathbf{g}_1, \ldots, \mathbf{g}_s)$ is a set of generators for $M$ and that $D$ is a matrix presenting
$M$, so the columns of $D$ generate $\mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s)$. Then the block matrix

\[ ( AD \quad I_t - AB ) \]

presents $M$ with respect to the generating set $\mathbf{f}_1, \ldots, \mathbf{f}_t$.

Proof. This follows immediately from the proof of Proposition (3.8)
above. We have also seen this result in part c of Exercise 31 in §1. □

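As an example of Proposition (3.8), we use the $F$, $G$, $A$, $B$ from (3.5) and
(3.6) and proceed as follows. Since

\[ S(g_1, g_2) = y^2 g_1 - x g_2 = -x = -g_1, \]

by Theorem (3.3) we have that

\[ \mathbf{s}_{12} = \begin{pmatrix} y^2 + 1 \\ -x \end{pmatrix} \]

generates $\mathrm{Syz}(g_1, g_2)$. Multiplying by $A$ we get

\[ A\mathbf{s}_{12} = \begin{pmatrix} -(y^3 - y^2 + y - 1)/2 \\ (xy^2 - x)/2 \end{pmatrix}. \]

Exercise 3. Verify directly that $A\mathbf{s}_{12}$ gives a syzygy on $(f_1, f_2)$.

Continuing the example, the columns of $I_2 - AB$ are

\[ S_1 = \begin{pmatrix} (y^2 + 1)/2 \\ -(xy + x)/2 \end{pmatrix}, \quad S_2 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]

So by the proposition

\[ \mathrm{Syz}(f_1, f_2) = \langle A\mathbf{s}_{12}, S_1 \rangle. \]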

This example has another instructive feature as shown in the following 
exercise. 



Exercise 4. Show that $A\mathbf{s}_{12}$ above is actually a multiple of $S_1$ by a
non-constant element of $R$. Deduce that $S_1$ alone generates $\mathrm{Syz}(f_1, f_2)$, yet $A\mathbf{s}_{12}$
does not. Compare to Exercise 11.



Hence, the Asij are not alone sufficient to generate the syzygies on F in 
some cases. 
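
The identities in this example can also be checked mechanically. What
follows is only a minimal sketch in Python, assuming that polynomials in
x, y over Q are stored as dictionaries from exponent pairs to coefficients.
The helper names poly, add, mul and matmul are introduced here for
illustration and do not belong to any of the computer algebra systems
mentioned in the text.

```python
from fractions import Fraction
from itertools import product

def poly(terms):
    """Polynomial in x, y over Q as {(deg_x, deg_y): coeff}, zeros dropped."""
    return {e: Fraction(c) for e, c in terms.items() if c != 0}

def add(p, q):
    r = dict(p)
    for e, c in q.items():
        r[e] = r.get(e, 0) + c
    return poly(r)

def mul(p, q):
    r = {}
    for (e1, c1), (e2, c2) in product(p.items(), q.items()):
        e = (e1[0] + e2[0], e1[1] + e2[1])
        r[e] = r.get(e, 0) + c1 * c2
    return poly(r)

def matmul(P, Q):
    """Product of two matrices (lists of rows) with polynomial entries."""
    out = []
    for row in P:
        out_row = []
        for j in range(len(Q[0])):
            entry = {}
            for k, p in enumerate(row):
                entry = add(entry, mul(p, Q[k][j]))
            out_row.append(entry)
        out.append(out_row)
    return out

ZERO, ONE = poly({}), poly({(0, 0): 1})
half = Fraction(1, 2)

f1 = poly({(1, 1): 1, (1, 0): 1})      # xy + x
f2 = poly({(0, 2): 1, (0, 0): 1})      # y^2 + 1
g1, g2 = poly({(1, 0): 1}), f2         # x, y^2 + 1

F, G = [[f1, f2]], [[g1, g2]]          # 1 x 2 matrices (the ideal case)
A = [[poly({(0, 1): -half, (0, 0): half}), ZERO],   # -(y - 1)/2, 0
     [poly({(1, 0): half}), ONE]]                   # x/2,        1
B = [[poly({(0, 1): 1, (0, 0): 1}), ZERO],          # y + 1, 0
     [ZERO, ONE]]                                   # 0,     1
I2 = [[ONE, ZERO], [ZERO, ONE]]

AB, BA = matmul(A, B), matmul(B, A)

# s12 = (y^2 + 1, -x)^T is the syzygy on (g1, g2) coming from S(g1, g2).
s12 = [[poly({(0, 2): 1, (0, 0): 1})], [poly({(1, 0): -1})]]
As12 = matmul(A, s12)

# S1 is the first column of I2 - AB.
minus_one = poly({(0, 0): -1})
S1 = [[add(I2[i][0], mul(minus_one, AB[i][0]))] for i in range(2)]

print(matmul(F, A) == G, matmul(G, B) == F)   # do G = FA and F = GB hold?
print(AB == I2, BA == I2)                     # are AB and BA identity matrices?
print(matmul(F, As12), matmul(F, S1))         # products with two candidate syzygies
```

An empty dictionary stands for the zero polynomial, so a product printed
as [[{}]] means that the column vector in question is a syzygy on F.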

Let us now return to the situation where $M$ is a submodule of $R^m$ and
$\mathbf{f}_1, \ldots, \mathbf{f}_t$ and $\mathbf{g}_1, \ldots, \mathbf{g}_s$ are different sets of generators of $M$. At first
glance, Corollary (3.9) seems a little asymmetric in that it privileges the
$\mathbf{g}$'s over the $\mathbf{f}$'s (in Proposition (3.8), this issue does not arise, because
it seems sensible to privilege a Gröbner basis over the set of generators
from which it was computed). Given presentations for $\mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t)$ and
$\mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s)$, Exercise 31 of §1 provides a block matrix which presents
$M$ with respect to the combined set of generators $\mathbf{f}_1, \ldots, \mathbf{f}_t, \mathbf{g}_1, \ldots, \mathbf{g}_s$ (and
which reduces to the matrices $F$ and $G$ separately). It is worth pausing
to phrase a result that links the sets of syzygies on any two generating
sets of the same module $M$.



(3.10) Proposition. Suppose that $\mathbf{f}_1, \ldots, \mathbf{f}_t$ and $\mathbf{g}_1, \ldots, \mathbf{g}_s$ are ordered
sets of elements of $R^m$ which generate the same module $M$. Then, there
are free $R$-modules $L$ and $L'$ such that

\[ \mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t) \oplus L \cong \mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s) \oplus L'. \]

Proof. We claim that $N = \mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t, \mathbf{g}_1, \ldots, \mathbf{g}_s)$ is a direct sum of
a module isomorphic to $\mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t)$ and a free module. In fact, $N$ is the
set of vectors $(c_1, \ldots, c_t, d_1, \ldots, d_s)^T \in R^{t+s}$ such that

\[ c_1 \mathbf{f}_1 + \cdots + c_t \mathbf{f}_t + d_1 \mathbf{g}_1 + \cdots + d_s \mathbf{g}_s = 0. \]

Now consider the submodule $K \subset N$ obtained by taking those elements
with all $d_i = 0$. Note that $K$ is clearly isomorphic to $\mathrm{Syz}(\mathbf{f}_1, \ldots, \mathbf{f}_t)$.
Moreover, since the $\mathbf{f}_j$ generate $M$, we can write $\mathbf{g}_i = \sum_j a_{ij} \mathbf{f}_j$. Then, each of
the $s$ vectors $\mathbf{n}_k = (a_{k1}, \ldots, a_{kt}, 0, \ldots, 0, -1, 0, \ldots, 0)^T$, with all terms in
the $(t + j)$th place, $1 \le j \le s$, equal to 0 except the $(t + k)$th term which
is $-1$, belongs to $N$. Moreover, the $\mathbf{n}_k$, $1 \le k \le s$, are clearly linearly
independent, so they generate a free submodule of $N$, which we will call
$L$. Clearly $K \cap L = 0$. To see that $N = K + L$, suppose that we have an
element $(c_1, \ldots, c_t, d_1, \ldots, d_s)^T \in N$. Then

\[ 0 = c_1 \mathbf{f}_1 + \cdots + c_t \mathbf{f}_t + d_1 \mathbf{g}_1 + \cdots + d_s \mathbf{g}_s \]
\[ \phantom{0} = c_1 \mathbf{f}_1 + \cdots + c_t \mathbf{f}_t + d_1 \sum_j a_{1j} \mathbf{f}_j + \cdots + d_s \sum_j a_{sj} \mathbf{f}_j, \]

so that

\[ (c_1, \ldots, c_t, d_1, \ldots, d_s)^T + \sum_j d_j \mathbf{n}_j
   = \Big(c_1 + \sum_j d_j a_{j1}, \ldots, c_t + \sum_j d_j a_{jt}, 0, \ldots, 0\Big)^T, \]

which belongs to $K$. This proves the claim. Similarly, we show that $N$ is a
direct sum of a module isomorphic to $\mathrm{Syz}(\mathbf{g}_1, \ldots, \mathbf{g}_s)$ and a free module,
and the result follows. □

Modules which become isomorphic after addition of free modules are
often called equivalent. The proposition shows that any two modules of
syzygies on any different sets of generators of the same module are
equivalent. We leave it as an exercise to show that any modules of syzygies on
equivalent modules are equivalent.

As an application, we will develop a syzygy-based algorithm for
computing ideal and submodule intersections. For ideals, this method is more
efficient than the elimination-based algorithm given in Chapter 1, §3,
Exercise 11. We start with the ideal case. The following statement gives
a connection between the intersection $I \cap J$ and syzygies on a certain
collection of elements of $R^2$.

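(3.11) Proposition. Let $I = \langle f_1, \ldots, f_t \rangle$ and $J = \langle g_1, \ldots, g_s \rangle$ be ideals
in $R$. A polynomial $h_0 \in R$ is an element of $I \cap J$ if and only if $h_0$ appears
as the first component in a syzygy

\[ (h_0, h_1, \ldots, h_t, h_{t+1}, \ldots, h_{t+s})^T \in R^{s+t+1} \]

in the module

\[ S = \mathrm{Syz}(\mathbf{v}_0, \mathbf{v}_1, \ldots, \mathbf{v}_t, \mathbf{v}_{t+1}, \ldots, \mathbf{v}_{s+t}) \]

where

\[ \mathbf{v}_0 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad
   \mathbf{v}_1 = \begin{pmatrix} f_1 \\ 0 \end{pmatrix}, \ldots, \mathbf{v}_t = \begin{pmatrix} f_t \\ 0 \end{pmatrix}, \quad
   \mathbf{v}_{t+1} = \begin{pmatrix} 0 \\ g_1 \end{pmatrix}, \ldots, \mathbf{v}_{s+t} = \begin{pmatrix} 0 \\ g_s \end{pmatrix} \]

in $R^2$.

Proof. Suppose that

\[ 0 = h_0 \mathbf{v}_0 + h_1 \mathbf{v}_1 + \cdots + h_t \mathbf{v}_t + h_{t+1} \mathbf{v}_{t+1} + \cdots + h_{s+t} \mathbf{v}_{s+t}. \]

From the first components, we obtain an equation

\[ 0 = h_0 + h_1 f_1 + \cdots + h_t f_t + 0 + \cdots + 0, \]

so $h_0 \in \langle f_1, \ldots, f_t \rangle = I$. Similarly from the second components, $h_0 \in
\langle g_1, \ldots, g_s \rangle = J$. Hence $h_0 \in I \cap J$.

On the other hand, in Exercise 8 below, you will show that every $h_0 \in I \cap
J$ appears as the first component in some syzygy on the $\mathbf{v}_0, \ldots, \mathbf{v}_{s+t}$. □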
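
As a small illustration of the proposition (the two ideals here are
chosen only for this example and do not come from the text), take
$I = \langle x \rangle$ and $J = \langle y \rangle$ in $k[x, y]$, so that $t = s = 1$. The following
worked equation exhibits one syzygy whose first component lies in $I \cap J$:

```latex
\[
\mathbf{v}_0 = \begin{pmatrix} 1 \\ 1 \end{pmatrix},\quad
\mathbf{v}_1 = \begin{pmatrix} x \\ 0 \end{pmatrix},\quad
\mathbf{v}_2 = \begin{pmatrix} 0 \\ y \end{pmatrix},
\qquad
xy\,\mathbf{v}_0 - y\,\mathbf{v}_1 - x\,\mathbf{v}_2
  = \begin{pmatrix} xy - xy \\ xy - xy \end{pmatrix}
  = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
```

Thus $(xy, -y, -x)^T$ is a syzygy on $\mathbf{v}_0, \mathbf{v}_1, \mathbf{v}_2$, and its first component
$h_0 = xy$ is indeed an element of $I \cap J = \langle xy \rangle$.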

Exercise 5. Show that Proposition (3.11) extends to submodules $M, N \subset
R^m$ in the following way. Say $M = \langle \mathbf{f}_1, \ldots, \mathbf{f}_t \rangle$, $N = \langle \mathbf{g}_1, \ldots, \mathbf{g}_s \rangle$ where
now the $\mathbf{f}_i, \mathbf{g}_j \in R^m$. In $R^{2m}$, consider the vectors $\mathbf{v}_{01}, \ldots, \mathbf{v}_{0m}$, where $\mathbf{v}_{0i}$
is formed by concatenating two copies of the standard basis vector $\mathbf{e}_i \in R^m$
to make a vector in $R^{2m}$. Then take $\mathbf{v}_1, \ldots, \mathbf{v}_t$, where $\mathbf{v}_i$ is formed by
appending $m$ zeros after the components of $\mathbf{f}_i$, and $\mathbf{v}_{t+1}, \ldots, \mathbf{v}_{t+s}$, where
$\mathbf{v}_{t+j}$ is formed by appending $m$ zeros before the components of $\mathbf{g}_j$. Show
that the statement of Proposition (3.11) goes over to this setting in the
following way: $(h_{01}, \ldots, h_{0m})^T \in M \cap N$ if and only if the $h_{01}, \ldots, h_{0m}$
appear as the first $m$ components in a syzygy in the module

\[ \mathrm{Syz}(\mathbf{v}_{01}, \ldots, \mathbf{v}_{0m}, \mathbf{v}_1, \ldots, \mathbf{v}_t, \mathbf{v}_{t+1}, \ldots, \mathbf{v}_{s+t}) \]

in $R^{m+s+t}$.

Exercise 6.

a. Using Propositions (3.8) and (3.11) and the previous exercise, develop
an algorithm for computing a set of generators for $M \cap N$.

b. In the ideal case ($m = 1$), show that if a POT extension of a monomial
order $>$ on $R$ is used and $\mathcal{G}$ is a Gröbner basis for the syzygy module
$S$ from Proposition (3.11), then the first components of the vectors in $\mathcal{G}$
give a Gröbner basis for $I \cap J$ with respect to $>$.

Macaulay has a built-in command syz for computing the module of syzygies
on the columns of a matrix using the method developed here. For
instance, with the matrix

\[ M = \begin{pmatrix} a^2 + b^2 & a^3 - 2bcd & a - b \\ c^2 - d^2 & b^3 + acd & c + d \end{pmatrix} \]

from the examples in §2, we could use: syz M -1 MS to compute syzygies.
Note: this produces a set of generators for the syzygies on the columns of the
original matrix M, not the Gröbner basis for the module they generate. The
-1 in the command indicates that we want the syzygies on all the columns
of the matrix; it is also possible to consider subsets of the columns. Try it!
Your output should be:

% syz M -1 MS
1.2.3.4.5.6.[126k].
computation complete after degree 6

% putmat MS
3
1

-ab3+b4+a3c+a3d-a2cd+abcd-2bc2d-2bcd2

-a2c-b2c+ac2-bc2-a2d-b2d-ad2+bd2

a2b3+b5-a3c2+a3cd+ab2cd+2bc3d+a3d2-2bcd3
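
One way to sanity-check such output is to multiply it back against the
matrix. The following is a minimal Python sketch, not part of Macaulay:
the parser for the compact notation (a2 for a^2, and so on) and the helper
names parse, add and mul are written here for illustration, and the matrix
entries are typed in by hand on the assumption that they match the
example from §2.

```python
import re
from itertools import product

VARS = "abcd"

def parse(text):
    """Read compact notation such as '-2bc2d' (meaning -2*b*c^2*d) into
    a dict {exponent tuple: integer coefficient}."""
    p = {}
    for term in re.findall(r"[+-]?[^+-]+", text):
        sign, digits, body = re.fullmatch(
            r"([+-]?)(\d*)((?:[abcd]\d*)*)", term).groups()
        coeff = int(digits or 1) * (-1 if sign == "-" else 1)
        exps = [0] * len(VARS)
        for var, e in re.findall(r"([abcd])(\d*)", body):
            exps[VARS.index(var)] += int(e or 1)
        key = tuple(exps)
        p[key] = p.get(key, 0) + coeff
    return {e: c for e, c in p.items() if c != 0}

def add(p, q):
    r = dict(p)
    for e, c in q.items():
        r[e] = r.get(e, 0) + c
    return {e: c for e, c in r.items() if c != 0}

def mul(p, q):
    r = {}
    for (e1, c1), (e2, c2) in product(p.items(), q.items()):
        e = tuple(i + j for i, j in zip(e1, e2))
        r[e] = r.get(e, 0) + c1 * c2
    return {e: c for e, c in r.items() if c != 0}

# The 2 x 3 matrix M, row by row (entries assumed as in the example).
M = [[parse("a2+b2"), parse("a3-2bcd"), parse("a-b")],
     [parse("c2-d2"), parse("b3+acd"), parse("c+d")]]

# The three components of the column vector printed by putmat MS.
syzygy = [parse("-ab3+b4+a3c+a3d-a2cd+abcd-2bc2d-2bcd2"),
          parse("-a2c-b2c+ac2-bc2-a2d-b2d-ad2+bd2"),
          parse("a2b3+b5-a3c2+a3cd+ab2cd+2bc3d+a3d2-2bcd3")]

# Multiply each row of M against the column vector, with integer arithmetic.
residual = []
for row in M:
    total = {}
    for entry, s in zip(row, syzygy):
        total = add(total, mul(entry, s))
    residual.append(total)

print(residual)
```

An empty dictionary stands for the zero polynomial. Since the arithmetic
here is exact integer arithmetic, a residual consisting of two empty
dictionaries would mean that the relation holds with integer coefficients.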

Exercise 7. Show that the syzygy computed by Macaulay is valid over
$k = \mathbb{Q}$ in this case even though the computation was done over $\mathbb{Z}/\langle 31991 \rangle$.

One can also use Singular (or CoCoA or CALI) to compute modules of
syzygies. In fact, since Singular works with $k = \mathbb{Q}$, we can use it to do
Exercise 7. Defining R and M as in Exercise 9 of §2, we enter syz(M); at
the prompt >. Depending on how wide your screen size is set, your output
will be:

> syz(M);
_[1]=a2b3*gen(3)+b5*gen(3)-a3c2*gen(3)+a3cd*gen(3)+ab2cd*
gen(3)+2bc3d*gen(3)+a3d2*gen(3)-2bcd3*gen(3)-ab3*gen(1)+
b4*gen(1)+a3c*gen(1)+a3d*gen(1)-a2cd*gen(1)+abcd*gen(1)-
2bc2d*gen(1)-2bcd2*gen(1)-a2c*gen(2)-b2c*gen(2)+ac2*gen(
2)-bc2*gen(2)-a2d*gen(2)-b2d*gen(2)-ad2*gen(2)+bd2*gen(2
)

Note that Singular uses the notation gen(1), gen(2), ... to refer to
the module elements $\mathbf{e}_1, \mathbf{e}_2, \ldots$. There is a range of options for formatting
output. To get output in a format closer to that given by Macaulay above,
change the ordering on the module to POT, bottom-down. That is, define
the ring R using the command

ring R = 0, (a, b, c, d), (c, dp);

Try it.



Additional Exercises for §3 

Exercise 8. Complete the proof of Proposition (3.11) by showing that
every element of the intersection $I \cap J$ appears as the first component $h_0$
in some syzygy in

\[ S = \mathrm{Syz}(\mathbf{v}_0, \mathbf{v}_1, \ldots, \mathbf{v}_t, \mathbf{v}_{t+1}, \ldots, \mathbf{v}_{s+t}). \]

Exercise 9. Let $I = \langle F \rangle = \langle xz - y, y^2 + yz + 2x \rangle$ in $k[x, y, z]$.
a. Find the monic reduced lex Gröbner basis $G = (g_1, \ldots, g_s)$ for $I$ and
the "change of basis" matrices $A, B$ such that $G = FA$, $F = GB$.







b. Find a set of generators for Syz(G) using Theorem (3.3). 

c. Compute a set of generators for Syz(F) using Proposition (3.8). 

Exercise 10. Let $(\mathbf{m}_1, \ldots, \mathbf{m}_t)$ be any ordered $t$-tuple of elements of $R^m$,
and let $S = \mathrm{Syz}(\mathbf{m}_1, \ldots, \mathbf{m}_t) \subset R^t$. Show that for any $1 \le s \le t$, the
projection of $S$ onto the first (that is, the top) $s$ components (that is, the
collection $N$ of $(a_1, \ldots, a_s) \in R^s$ such that $a_1, \ldots, a_s$ appear as the first
$s$ elements in some element of $S$) forms a submodule of $R^s$. Hint: $N$ is not
the same as $\mathrm{Syz}(\mathbf{m}_1, \ldots, \mathbf{m}_s)$.

Exercise 11. In this exercise, you will use syzygies to compute the ideal
quotient $I : J$. Recall from part b of Exercise 13 from Chapter 1, §1 of this book
that if $I \cap \langle h \rangle = \langle g_1, \ldots, g_t \rangle$, then $I : \langle h \rangle = \langle g_1/h, \ldots, g_t/h \rangle$.

a. Using Proposition (3.11) (not elimination), give an algorithm for
computing $I : \langle h \rangle$. Explain how the $g_i/h$ can be computed without
factoring.

b. Now generalize part a to compute $I : J$ for any ideals $I, J$. Hint: If $J =
\langle h_1, \ldots, h_s \rangle$, then by [CLO], Chapter 4, §4, Proposition 10,

\[ I : J = \bigcap_{j=1}^{s} (I : \langle h_j \rangle). \]

Exercise 12. Show that a homogeneous syzygy $(c_1 x^{\alpha - \alpha_1}, \ldots, c_s x^{\alpha - \alpha_s})^T$
on a collection of monomials $x^{\alpha_1}, \ldots, x^{\alpha_s}$ in $R$ can be written as a sum
of homogeneous syzygies between pairs of the $x^{\alpha_i}$. (See the proof of
Proposition (2.3) part c.)

Exercise 13. If $f_1, f_2 \in R$, use unique factorization to show that
$\mathrm{Syz}(f_1, f_2)$ is generated by a single element. Compare with Exercise 4.

Exercise 14. 

a. Show that the notion of equivalence defined after Proposition (3.10) is
an equivalence relation on $R$-modules. (That is, show that it is reflexive,
symmetric and transitive.)

b. Suppose that $M$ and $M'$ are two $R$-modules which are equivalent in the
sense described after Proposition (3.10). That is, there are free modules
$L$ and $L'$ such that $M \oplus L$ is isomorphic to $M' \oplus L'$. Show that any two
modules of syzygies on $M$ and $M'$ are equivalent.

Exercise 15. Re-do Exercise 27, parts a and b, from §1 using
Proposition (3.8). In fact, write out and prove Proposition (3.8) in the special
case that $F = (f_1, \ldots, f_t)$ is an ordered set of elements of $R$ such that
$1 \in \langle f_1, \ldots, f_t \rangle$ (in which case the Gröbner basis $\mathcal{G}$ consists of the single
element $\{1\}$).
§4 Modules over Local Rings

The last two sections have dealt with modules over polynomial rings. In
this section, we consider modules over local rings. It turns out that the
adaptation of Gröbner basis techniques to local rings outlined in Chapter 4
extends without difficulty to the case of modules over local rings. Moreover,
as we shall see, modules over local rings are simpler in many ways than
modules over a polynomial ring.

As in the preceding sections, $R$ will denote the polynomial ring
$k[x_1, \ldots, x_n]$ and we shall let $Q$ denote any one of the local rings obtained
from $R$ considered in Chapter 4. More precisely, corresponding to any point
$p = (p_1, \ldots, p_n)$ of affine $n$-space $k^n$ we obtain the localization $R_p$ of $R$,

\[ R_p = \{ f/g : f, g \in R \text{ and } g(p) \ne 0 \}
       = \{ \text{rational functions defined at } p \}. \]

If $k = \mathbb{R}$ or $\mathbb{C}$, we can also consider the ring of convergent power series at
$p$, denoted $k\{x_1 - p_1, \ldots, x_n - p_n\}$, and for general $k$, we can study the
ring of formal power series at $p$, denoted

\[ k[[x_1 - p_1, \ldots, x_n - p_n]]. \]

The notation $Q$ will refer to any of these. By the local ring at the point
$p$, we will mean $R_p$. Whenever convenient, we take the point $p$ to be the
origin $0 \in k^n$, in which case $R_p = R_0 = k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$.

In Chapter 4, we restricted ourselves to ideals in $Q$ generated by
polynomials. We make the analogous restriction in the case of modules. That is,
we shall only consider $Q$-modules which are either submodules of $Q^s$ which
can be generated by polynomials (that is, by vectors all of whose entries
are polynomials) or modules that have a presentation matrix all of whose
entries are in $R$.

Exercise 1. If $Q = k[x_1, \ldots, x_n]_{\langle x_1, \ldots, x_n \rangle}$, show that any submodule of
$Q^s$ can be generated by generators which are finite $k$-linear combinations
of monomials.

Given any $R$-module $M$ and any point $p \in k^n$, there is a natural
$R_p$-module, denoted $M_p$ and called the localization of $M$ at $p$, obtained by
allowing the elements of $M$ to be multiplied by elements of $R_p$. If $M$ is an
ideal $I$ in $R$, then $M_p$ is just the ideal $IR_p$. If $M \subset R^s$ is generated by
vectors $\mathbf{f}_1, \ldots, \mathbf{f}_t$, then $M_p$ is generated by $\mathbf{f}_1, \ldots, \mathbf{f}_t$, where the entries in
the vectors are considered as rational functions and one allows
multiplication by elements of $R_p$. If $M$ is presented by the $m \times n$ matrix $A$, then
$M_p$ is also presented by $A$. We leave the proof as an exercise.



Exercise 2. Let $M$ be an $R$-module, and $A$ a presentation matrix for $M$.
If $p \in k^n$ is any point, show that $A$ is a presentation matrix for the
$R_p$-module $M_p$. Hint: The columns of $A$ continue to be syzygies over $R_p$, so
one only needs to observe that any $R_p$-linear relation on the generators is
a multiple of an $R$-linear relation on the generators.



It is worth noting, however, that even though the presentation matrix
$A$ of $M$ is also a presentation matrix for $M_p$, the matrix $A$ may simplify
much more drastically over $R_p$ than over $R$. For example, let $R = k[x, y]$
and consider the matrix

\[ A = \begin{pmatrix} x & x^2 \\ 1 + y & y^2 \\ xy & 0 \end{pmatrix}. \]




This does not simplify substantially over $R$ under the rules of Exercise 29
from §1. However, over $R_0$, we can divide the second row by $1 + y$, which
is a unit in $R_0$, and use the resulting 1 in the first column, second row, to
clear out all other elements in the first column. We obtain the matrix on
the left which reduces further as shown




Thus, the matrix $A$ presents an $R_0$-module isomorphic to the ideal
$\langle y^3, x + xy - y^2 \rangle$.



Exercise 3. Let $A$ be as above.

a. Consider the $R$-module $M$ presented by the matrix $A$. Prove that $M$ is
isomorphic to the ideal $\langle y^3, yx^2, -y^2 + x + xy \rangle$.

b. Show that in $R_0$ the ideal $\langle y^3, yx^2, -y^2 + x + xy \rangle$ is equal to the ideal
$\langle y^3, x + xy - y^2 \rangle$.

To extend the algorithmic methods outlined in Chapter 4 to submodules
of $Q^s$, one first extends local orders on $Q$ to orders on $Q^s$. Just as for well
orderings, there are many ways to extend a given local order. In particular,
given a local order on $Q$ one has both TOP and POT extensions to $Q^s$. The
local division algorithm (Theorem (3.10) of Chapter 4) extends to elements
of $k[x_1, \ldots, x_n]^s_{\langle x_1, \ldots, x_n \rangle}$ in exactly the same way as the ordinary division
algorithm extends to $k[x_1, \ldots, x_n]^s$. One has to give up the determinate
remainder, and the proof of termination is delicate, but exactly the same
as the proof of the local division algorithm. One defines Gröbner (or
standard) bases, and S-vectors exactly as in the polynomial case and checks
that Buchberger's criterion continues to hold: that is, a set $\{\mathbf{f}_1, \ldots, \mathbf{f}_t\}$ of
vectors in $Q^s$ is a standard basis exactly when each S-vector on any pair
$\mathbf{f}_i, \mathbf{f}_j$ has remainder 0 when divided by $\{\mathbf{f}_1, \ldots, \mathbf{f}_t\}$, the division
being done using the extended local division algorithm. This immediately
provides an algorithm for extending a set of polynomial vectors in $Q^s$ to
a standard basis of polynomial vectors (provided only that one can show
termination after a finite number of steps, which follows exactly as in the
case of Mora's algorithm for elements of $Q$). This algorithm is often called
Mora's algorithm for modules, and is implemented in the computer algebra
programs CALI and Singular.

Once we have a method for getting standard bases, we immediately get an
algorithm for determining whether an element belongs to a submodule
of $Q^s$ generated by polynomial vectors. Likewise, everything we said about
syzygies in the last section continues to hold for modules over local rings.
In particular, a set of generators $\{\mathbf{f}_1, \ldots, \mathbf{f}_s\}$ for a submodule of $Q^s$ is a
standard basis precisely when every syzygy on the leading terms of the
$\mathbf{f}_i$ lifts to a syzygy on the $\mathbf{f}_i$. Schreyer's Theorem for computation of syzygies
given a standard basis carries over word for word, and the analogues of
Proposition (3.8) and Corollary (3.9) continue to hold without change.
Thus, we can compute syzygies on any set of polynomial vectors in $Q^s$.

In the rest of this section, we shall detail a number of ways in which
modules over local rings are different from, and better behaved than, modules over
polynomial rings. This is important, because one can often establish facts
about modules over polynomial rings by establishing the corresponding
facts for their localizations.



Minimal generating sets 

Given a finitely generated module $M$ over a ring, define the minimal number
of generators of the module $M$, often denoted $\mu(M)$, to be the smallest
number of elements in any generating set of $M$. If the module $M$ is free,
one can show that any basis has $\mu(M)$ elements (in particular, all bases
have the same number of elements). However, if $M$ is not free (or if you
don't know whether it is free), it can be quite difficult to compute $\mu(M)$.
The reason is that an arbitrary set of generators for $M$ will not, in general,
contain a subset of $\mu(M)$ elements that generate. In fact, one can easily
find examples of sets of generators which are unshortenable in the sense
that no proper subset of them generates.

Exercise 4. Let $R$ be the ring $k[x, y]$ and let $M$ be the ideal generated by
$\{xy(y - 1), xy(x - 1), x(y - 1)(x - 1)\}$.

a. Show that this set is unshortenable. Hint: The least inspired way of
doing this is to compute Gröbner bases, which can be done by hand.
A more elegant way is to argue geometrically. Each of the generators
defines a union of three lines and the variety corresponding to $M$ is the
intersection of the three sets of three lines.
b. Show that $M = \langle xy^2 - x^2 y, x^2 - x \rangle$.

We should also mention that Exercise 10e of §1 gives an example of a
free module with an unshortenable set of $\mu(M) + 1$ generators.

For modules M over a local ring Q, however, this problem does not arise. 
Unshortenable sets of generators are minimal, and any set of generators 
contains an unshortenable set. 

Exercise 5. Let $R = k[x, y]$ and $M$ be as in Exercise 4. Let $M_0$ be the
ideal in $R_0$ obtained by localizing at the origin.

a. Since $\{xy(y - 1), xy(x - 1), x(y - 1)(x - 1)\}$ generates $M$ in $R$, it
generates $M_0$ in $R_0$. Show that this set of generators is shortenable.
What is the shortest unshortenable subset of it that generates $M_0$?

b. Answer the same questions for the set $\{xy^2 - x^2 y, x^2 - x\}$.

c. With the notation of Exercise 10 of §1, let N be the i?-module generated 
by hi, h2, hs C R^ and Nq C {Rq)^ the /^-module they generate. Find 
an unshortenable subset of {hi, h2, h2 that generates Nq. 

Moreover, it turns out to be easy to compute $\mu(M)$ when $M$ is a module
over a local ring $Q$. The reason is the following extremely simple, and
extremely useful, result and its corollaries, which hold for all finitely generated
modules over a local ring.

(4.1) Lemma (Nakayama's Lemma). Let $Q$ be a local ring with
maximal ideal $\mathfrak{m}$, and let $M$ be a finitely generated $Q$-module. If $\mathfrak{m}M = M$,
then $M = 0$.

Proof. Suppose that $M \ne 0$, and let $f_1, \ldots, f_s$ be a minimal set of
generators of $M$. Then $f_s \in \mathfrak{m}M$. Thus, $f_s = a_1 f_1 + \cdots + a_s f_s$ for some
$a_1, \ldots, a_s \in \mathfrak{m}$. Hence,

\[ (1 - a_s) f_s = a_1 f_1 + \cdots + a_{s-1} f_{s-1}. \]

But $1 - a_s$ is a unit because $a_s \in \mathfrak{m}$, so we have that $f_s$ is a $Q$-linear
combination of $f_1, \ldots, f_{s-1}$. This contradicts the minimality of the generating
set. □

As a corollary, we obtain the following (equivalent) statement.

(4.2) Corollary. Let $Q$ be a local ring with maximal ideal $\mathfrak{m}$, let $M$ be
a finitely generated $Q$-module, and let $N$ be a submodule of $M$. If $M =
\mathfrak{m}M + N$, then $M = N$.

Proof. Note that $\mathfrak{m}(M/N) = (\mathfrak{m}M + N)/N$. Now apply Nakayama's
lemma to $M/N$. □

Recall from Exercise 23 of §1 of this chapter that if $R$ is any ring, $I$
any ideal of $R$, and $M$ an $R$-module, then $M/IM$ is an $R/I$-module. If, in
addition, $I$ is a maximal ideal, then $R/I$ is a field, so $M/IM$ is a vector
space over $R/I$ (any module over a field $k$ is a $k$-vector space). If $M$ is
finitely generated, then $M/IM$ is finite-dimensional. In fact, if $f_1, \ldots, f_s$
generate $M$ as an $R$-module, then the residue classes $[f_1], \ldots, [f_s]$ in
$M/IM$ span $M/IM$ as a vector space. If $R$ is a local ring $Q$ with maximal
ideal $\mathfrak{m}$, then the converse is true: if $[f_1], \ldots, [f_s]$ span $M/\mathfrak{m}M$, then
$M = \langle f_1, \ldots, f_s \rangle + \mathfrak{m}M$ and Corollary (4.2) to Nakayama's lemma implies
that $M = \langle f_1, \ldots, f_s \rangle$. In fact, we can say more.

(4.3) Proposition. Let $Q$, $\mathfrak{m}$ be a local ring, $k = Q/\mathfrak{m}$ its residue field
(the underlying field of constants), and $M$ any finitely generated $Q$-module.

a. $f_1, \ldots, f_s$ is a minimal generating set of $M$ if and only if the cosets
$[f_1], \ldots, [f_s]$ form a basis of the $k$-vector space $M/\mathfrak{m}M$.

b. Any generating set of $M$ contains a minimal generating set. Any
unshortenable set of generators of $M$ is a minimal set of generators.

c. One can extend the set $f_1, \ldots, f_t$ to a minimal generating set of $M$ if
and only if the cosets $[f_1], \ldots, [f_t]$ are linearly independent over $k$.

Proof. The first statement follows from the discussion preceding the
proposition. The second two statements follow as in linear algebra and
are left as exercises. □



An example may make this clearer. Suppose that $Q = k[[x, y]]$ and let
$M = \langle \mathbf{f}_1, \mathbf{f}_2 \rangle \subset k[[x, y]]^2$ be the $Q$-module generated by



x‘^ y^ xy' 



y^ + 



Then, $\mathfrak{m}M = \langle x, y \rangle M$ is generated by $x\mathbf{f}_1, y\mathbf{f}_1, x\mathbf{f}_2, y\mathbf{f}_2$. Anything in $M$ is of
the form $p(x, y)\mathbf{f}_1 + q(x, y)\mathbf{f}_2$ where $p, q$ are formal power series. Since we can
always write $p(x, y) = p(0, 0) + x p_1(x, y) + y p_2(x, y)$ for some (non-unique)
choice of power series $p_1, p_2$ and, similarly, $q(x, y) = q(0, 0) + x q_1(x, y) +
y q_2(x, y)$, we see that $p(x, y)\mathbf{f}_1 + q(x, y)\mathbf{f}_2$ is congruent to $p(0, 0)\mathbf{f}_1 + q(0, 0)\mathbf{f}_2$
modulo $\langle x\mathbf{f}_1, y\mathbf{f}_1, x\mathbf{f}_2, y\mathbf{f}_2 \rangle$. The latter is a $k$-linear combination of $[\mathbf{f}_1]$ and
$[\mathbf{f}_2]$. (The reader can also check that $[\mathbf{f}_1]$ and $[\mathbf{f}_2]$ are $k$-linearly independent.)

If $M$ is a module over a local ring, then Proposition (4.3) gives a method
to determine $\mu(M)$ in principle. One might ask, however, if there is a way
to determine $\mu(M)$ from a presentation of $M$. Can one, perhaps, find a
presentation matrix of $M/\mathfrak{m}M$? There is a very satisfying answer to this.
First we need a little lemma, which applies to any ring.



(4.4) Lemma. Let $P$ be any ring (e.g. $R$, $Q$, $R/J$, ...), let $I$ be an ideal
in $P$, let $M$ be any finitely generated $P$-module, and let $A$ be a presentation
matrix for $M$. If we let $\bar{A}$ denote the matrix obtained from $A$ by interpreting
each entry in $A$ as its residue class modulo $I$, then $\bar{A}$ presents $M/IM$.

Proof. To say an $m \times s$ matrix $A$ presents $M$ is to say that there are
generators $f_1, \ldots, f_m$ of $M$ and that if $a_1 f_1 + \cdots + a_m f_m = 0$ with
$a_1, \ldots, a_m \in P$ is any relation, then the column vector $(a_1, \ldots, a_m)^T$
is a $P$-linear combination of the columns of $A$. It is clear that the
images $[f_1], \ldots, [f_m]$ generate $M/IM$. So we need only show that the
columns of $\bar{A}$ span the set of all syzygies on the $[f_i]$. So, suppose that
$[r_1][f_1] + \cdots + [r_m][f_m] = 0$ in $M/IM$ (here $r_i \in P$ and $[r_i]$ is the coset $r_i + I$
it represents). Then $r_1 f_1 + \cdots + r_m f_m \in IM$. Thus,

\[ r_1 f_1 + \cdots + r_m f_m = b_1 f_1 + \cdots + b_m f_m \]

for some $b_i \in I$, whence

\[ (r_1 - b_1) f_1 + \cdots + (r_m - b_m) f_m = 0. \]

By assumption, $(r_1 - b_1, \ldots, r_m - b_m)^T$ is a $P$-linear combination of the
columns of $A$. Hence $([r_1 - b_1], \ldots, [r_m - b_m])^T$ is a $P/I$-linear combination
of the columns of $\bar{A}$. But $[r_i - b_i] = [r_i]$ because $b_i \in I$, for all $i = 1, \ldots, m$.
Thus the columns of $\bar{A}$ generate all syzygies on $[f_1], \ldots, [f_m]$, and this
completes the proof. □

And, now, for the result we alluded to above. 

(4.5) Proposition. Let $M$ be an $R$-module, $R = k[x_1, \ldots, x_n]$, and
suppose that $A$ is a matrix presenting $M$. If $p \in k^n$ is any point in affine
$n$-space, let $A(p)$ be the matrix obtained by evaluating all the entries of
$A$ (which are polynomials $a_{ij} \in R$) at $p$. Then $A(p)$ presents $M_p/\mathfrak{m}_p M_p$,
where $\mathfrak{m}_p$ is the unique maximal ideal $\langle x_1 - p_1, \ldots, x_n - p_n \rangle$ in $R_p$.

Proof. Write $A = (a_{ij})$. Since $A$ presents $M$, it also presents the
$R_p$-module $M_p$ by Exercise 2 above. By Lemma (4.4), $[A] = (a_{ij} \bmod \mathfrak{m}_p)$
presents $M_p/\mathfrak{m}_p M_p$. But $a_{ij} \equiv a_{ij}(p) \bmod \mathfrak{m}_p$ (exercise!). □

(4.6) Corollary. Let $M$ and $A$ be as above. For any $p \in k^n$, $\mu(M_p) =
m - \mathrm{rk}(A(p))$, where $\mathrm{rk}(A(p))$ denotes the usual rank of a matrix over a
field $k$ (that is, the number of linearly independent rows or columns).

Proof. By Proposition (4.3), $\mu(M_p) = \dim M_p/\mathfrak{m}_p M_p$. Suppose that $A$
is an $m \times s$ matrix. We know that $A(p)$ presents $M_p/\mathfrak{m}_p M_p$. Then, by
Proposition (1.10), $M_p/\mathfrak{m}_p M_p$ is isomorphic to $k^m/A(p)k^s$ (where $A(p)k^s$
is the image of $A(p)$), and the dimension of the latter is $m - \mathrm{rk}(A(p))$. □
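
The formula in Corollary (4.6) is easy to evaluate by machine. Here is a
minimal Python sketch, assuming $k = \mathbb{Q}$ and a presentation matrix whose
entries are given as Python functions of the coordinates; the names rank
and min_generators, and the sample $3 \times 2$ matrix, are chosen here for
illustration only.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix over Q, by Gaussian elimination in exact arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def min_generators(A, p):
    """Evaluate m - rk(A(p)) for an m x s matrix A of callables at the point p."""
    Ap = [[entry(*p) for entry in row] for row in A]
    return len(A) - rank(Ap)

# Sample 3 x 2 presentation matrix over Q[x, y] (an assumption for this sketch).
A = [
    [lambda x, y: x,     lambda x, y: x**2],
    [lambda x, y: 1 + y, lambda x, y: y**2],
    [lambda x, y: x * y, lambda x, y: 0],
]

for p in [(0, 0), (0, -1), (1, 1)]:
    print(p, min_generators(A, p))
```

Only the rank of the evaluated matrix enters, so no Gröbner basis or
standard basis computation is needed to find the number of generators
at a given point.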

Minimal presentations 

As a result of the discussion above, we have a privileged set of
presentation matrices of any finitely generated module $M$ over a local ring $Q$.
Namely, we choose a minimal set of generators of $M$. The set of syzygies
on this set is again a module over the local ring $Q$, so we choose a minimal
generating set for this set of syzygies. As usual, we arrange the syzygies as
columns to obtain a matrix, which we call a minimal presentation matrix
of $M$. We claim that the dimensions of this matrix do not depend on the
choice of minimal generating set of $M$, and that any minimal presentation
matrix can be obtained from another by a change of generators.

(4.7) Proposition. 

a. Minimal presentation matrices for finitely generated modules M over 
a local ring Q are essentially unique in the following sense. Let F = 
(/i, • • • , /m) o,f^d G = (^i» • • • , ^m) two minimal generating sets for 
M. Let A be anm X s minimal presentation matrix for M with respect 
to F. Similarly, let B be an m x t minimal presentation matrix for M 
with respect to G. Then s = t and B — CAD, where C is the m x m 
change of basis matrix satisfying F = GC, and D is an invertible s x s 
matrix with entries in Q. 

b. If a presentation matrix A for M is a minimal presentation matrix then 
all entries of A belong to the maximal ideal of Q. 

Proof. To prove part a, first note that we have $F = GC$ and $G = FC'$
for some $m \times m$ matrices with entries in Q. By Proposition (4.3), reducing
mod $\mathfrak{m}$, the matrices $C$ and $C'$ are invertible $m \times m$ matrices over k. By
Corollary (3.9) of this chapter, the columns of $CA$ are in $T = \mathrm{Syz}(G)$,
and by the preceding remarks, the cosets of those columns in $T/\mathfrak{m}T$ must
be linearly independent over k. Hence, we must have $s \le t$. Similarly, the
columns of $C'B$ are in $S = \mathrm{Syz}(F)$, and the cosets of those columns in
$S/\mathfrak{m}S$ must be linearly independent over k. Hence, $t \le s$. It follows that
$s = t$, so that by Proposition (4.3) the columns of $CA$ are a minimal
generating set for $\mathrm{Syz}(G)$. Hence, $B = CAD$ for some invertible $s \times s$
matrix D.

For part b, we claim that no entry of a minimal presentation matrix
A can be a unit of Q. Indeed, if the $i,j$ entry were a unit, then $f_i$ could
be expressed in terms of the other $f_k$, contradicting the assertion that
$\{f_1, \ldots, f_m\}$ is minimal. □

If we are given an explicit set of generators for a submodule M of a free
module over Q, then the last assertion of the proposition provides an algorithm
for computing the minimal presentation of M. One prunes the given set of
generators so that it is minimal, computes a basis of the module of syzygies
on the chosen set, and discards any syzygies which involve units.

For example, the minimal presentation matrix of the ideal
$\langle x, y, z \rangle \subset k[[x, y, z]]$ is the matrix with Koszul relations as columns

$$A = \begin{pmatrix} y & z & 0 \\ -x & 0 & z \\ 0 & -x & -y \end{pmatrix}.$$

We have seen that free modules are the simplest modules. However, it is
sometimes difficult to actually determine whether a given module is free. 
Given a presentation matrix of a module over a local ring, there is a criterion 
which allows one to determine whether or not the module is free. To do 
this, we introduce a sequence of ideals which are defined in terms of the 
presentation matrix for a module, but which turn out to be independent of 
the presentation. 

Let M be a finitely generated R-module (R any ring) and let A be a
presentation matrix for M. Then the ideal of ith minors $I_i(A)$ is the ideal
generated by the ith minors of A (that is, by the determinants of $i \times i$
submatrices of A). Here, we define the 0th minor of A to be 1 (so that
$I_0(A) = R$). More generally, if $i < 0$, we define $I_i(A) = R$. If i exceeds the
number of rows or number of columns of A, we define the ith minor to be
0, so that $I_i(A) = 0$ for sufficiently large i. Although defined in terms of
a presentation matrix A, the ideals will turn out to yield invariants of the
module M.

(4.8) Lemma. Let M be an R-module, R any ring. If A and B are matrices
that both present M, and that have the same number of rows, then
$I_i(A) = I_i(B)$ for all i.

Proof. We leave the proof as an exercise — see Exercise 10. □ 

The restriction that the presentation matrices have the same number of
rows is irksome, but necessary. The matrices $A = (0)$ and
$B = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$
clearly present the same module (namely, the free module R). Note that
$I_0(A) = R$, $I_1(A) = (0)$, while $I_0(B) = R$, $I_1(B) = R$. It turns out to be
more convenient to change the indexing of the ideals of minors.

(4.9) Definition. If M is an R-module presented by A, the ith Fitting
invariant $F_i(M)$ is defined by setting $F_i(M) = I_{m-i}(A)$ where A has m
rows.

Notice that with this shift in index, the Fitting invariants of the free
R-module R are $F_i(R) = R$ for $i > 0$ and $F_i(R) = (0)$ for $i \le 0$, no matter
whether we use the matrix A or B above to compute the $F_i$.
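To make the index shift in Definition (4.9) concrete, here is a small sketch over the ring $R = \mathbb{Z}$, which is an assumption made only so that every ideal is principal and can be recorded by a single nonnegative generator (0 for the zero ideal, 1 for the whole ring). The function names `det`, `ideal_of_minors`, and `fitting` are hypothetical helpers, not part of any library.

```python
from itertools import combinations
from math import gcd

def det(mat):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not mat:
        return 1
    total = 0
    for j, entry in enumerate(mat[0]):
        minor = [row[:j] + row[j + 1:] for row in mat[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def ideal_of_minors(mat, i):
    """Nonnegative generator of I_i(mat) as an ideal of Z."""
    rows, cols = len(mat), len(mat[0])
    if i <= 0:
        return 1          # I_i = R for i <= 0
    if i > min(rows, cols):
        return 0          # no i x i submatrices, so I_i = 0
    g = 0
    for rs in combinations(range(rows), i):
        for cs in combinations(range(cols), i):
            g = gcd(g, det([[mat[r][c] for c in cs] for r in rs]))
    return g

def fitting(mat, i):
    """F_i(M) = I_{m-i}(A), where A has m rows."""
    return ideal_of_minors(mat, len(mat) - i)

A = [[0]]            # one generator, trivial relation: presents Z
B = [[1], [0]]       # two generators, first one killed: also presents Z
C = [[2, 0], [0, 6]] # presents Z/2 + Z/6

print([fitting(A, i) for i in range(3)])
print([fitting(B, i) for i in range(3)])
print([fitting(C, i) for i in range(3)])
```

The matrices `A` and `B` have different ideals of minors but the same Fitting invariants, which is the point of the reindexing. The third matrix is an extra illustration chosen here, not an example from the text.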

(4.10) Proposition. The Fitting invariants of a module depend only on 
the module, and not on the presentation. That is, isomorphic modules have 
isomorphic Fitting invariants. 




Proof. This is an immediate corollary of Lemma (4.8) and the definition 
of the Fitting invariants. See Exercise 10. □ 

For modules over local rings, it is easy to show that necessary and sufficient
conditions for a module to be free can be given in terms of the Fitting
invariants.

(4.11) Proposition. Let Q be a local ring, M a finitely generated Q-module.
Then M is free of rank r if and only if $F_i(M) = 0$ for $i < r$ and
$F_i(M) = R$ for $i \ge r$.

Proof. By Proposition (4.10), $F_i(M)$ does not depend on the choice of
matrix A presenting M. If M is free of rank r, then we can take A to
be the $m \times 1$ matrix all of whose entries are 0 (here $m = r$). Computing $F_i$
using this presentation gives $F_i(M) = 0$ for $i < r$ and $F_i(M) = R$ for $i \ge r$.

Conversely, suppose that A is some $m \times s$ matrix presenting M, and
suppose that $I_0(A) = I_1(A) = \cdots = I_{m-r}(A) = R$ and $I_{m-r+1}(A) = 0$.
Since R is local, this means that some $(m - r) \times (m - r)$ minor of A is a unit
(an R-linear combination of elements of a local ring which is a unit must
be such that one of the elements is a unit). This minor is a sum of terms,
each a product of $m - r$ terms of R. Again because R is local, one such
summand must be a unit, and, hence, the $m - r$ terms that multiply to give
it must be units. By exchanging columns and rows of A, we may assume
that $a_{11}, a_{22}, \ldots, a_{m-r,m-r}$ are units. By row and column operations we
may arrange that $a_{11} = a_{22} = \cdots = a_{m-r,m-r} = 1$ and that all other
entries in the first $m - r$ rows and first $m - r$ columns are zero.

We claim that all other entries of A must be zero. To see this, suppose
that some other entry were nonzero, say $f \in A$. We could arrange that
$a_{m-r+1,m-r+1} = f$ by leaving the first $m - r$ columns and rows fixed,
and exchanging other rows and columns as necessary. But then the
$(m - r + 1) \times (m - r + 1)$ minor obtained by taking the determinant of the
submatrix consisting of the first $m - r + 1$ rows and columns would equal
f and $I_{m-r+1}(A)$ could not equal zero.

Since A is m x s, we conclude that A presents a module with m genera- 
tors, the first m — r of which are equal to zero and the last r of which only 
satisfy the trivial relation. This says that M is free of rank r. □ 

Projective modules 

Besides free modules, there is another class of modules over any ring which 
are almost as simple to deal with as free modules. These are the so-called 
projective modules. 

(4.12) Definition. If R is any ring, an R-module M is said to be projective
if there is an R-module N such that $M \oplus N$ is a free module.

That is, a projective module is a summand of a free module. Such a 
notion arises when dealing with syzygies, as shown by the following exercise 
(compare Exercise 26 of §1). 

Exercise 6.

a. Suppose that a module M has generators $g_1, \ldots, g_s$ so that the module
of syzygies $\mathrm{Syz}(g_1, \ldots, g_s)$ is free. Then let $f_1, \ldots, f_t$ be another
generating set of M. Use Proposition (3.10) of the preceding section to prove
that $\mathrm{Syz}(f_1, \ldots, f_t)$ is projective.

b. Let $\{f_1, \ldots, f_t\} \subset R$, R any ring, be a set of elements such
that $\langle f_1, \ldots, f_t \rangle = R$. Show that $\mathrm{Syz}(f_1, \ldots, f_t)$ is projective. Hint:
Use part a.

Every free module is clearly a projective module, but not conversely. In
Exercise 26a of §1, we point out that $\mathbb{Z}/6 \cong \mathbb{Z}/3 \oplus \mathbb{Z}/2$, but $\mathbb{Z}/3$ is clearly
not a free $(\mathbb{Z}/6)$-module.

Over a local ring, however, it is easy to show that any projective module 
is free. 

(4.13) Theorem. If Q is a local ring, and M a projective Q-module, then
M is free.

Proof. By assumption, there is a module N such that $M \oplus N = Q^s$, some
s. We may harmlessly view M as a submodule of $Q^s$. Choose a minimal
generating set $f_1, \ldots, f_m$ of M. If we let $\mathfrak{m}$ denote the maximal ideal of
Q, then $f_1 + \mathfrak{m}M, \ldots, f_m + \mathfrak{m}M$ are a basis of $M/\mathfrak{m}M$. Since $M \cap N = \{0\}$,
$f_1 + \mathfrak{m}M + \mathfrak{m}N, \ldots, f_m + \mathfrak{m}M + \mathfrak{m}N$ are linearly independent in
$M/(\mathfrak{m}M + \mathfrak{m}N) \subset (M + N)/\mathfrak{m}(M + N)$. Therefore, by the second part of
Proposition (4.3), $f_1, \ldots, f_m$ extend to a minimal generating set of $M \oplus N$,
which is a basis, hence linearly independent over Q. But then, $f_1, \ldots, f_m$
must be linearly independent over Q, and hence a basis of M. Thus, M is
free. □

For a long time, it was an open question as to whether the above result
continues to hold for polynomial rings over a field. The assertion that such
is the case (that is, that every projective $k[x_1, \ldots, x_n]$-module is free) is
known as Serre's conjecture and was finally proved by Quillen and Suslin
independently in 1976 (see Theorem (1.8), and Exercises 26 and 27 of §1
for more information).

Since modules over local rings are so much simpler than modules over 
a polynomial ring, one often tries to establish results about modules over 
polynomial rings by establishing the result for the localizations of the mod- 
ules at all points. One then hopes that this will be enough to establish the 
result for modules over the polynomial ring. 

We give one example of this here, phrased to make its algorithmic 
importance clear. We learned it from M. Artin’s Algebra [Art]. 

(4.14) Theorem. Let M be a finitely generated module over a polynomial
ring $R = k[x_1, \ldots, x_n]$ with $k = \mathbb{C}$ and let A be an $m \times s$ matrix presenting
M. Then M is a free module of rank r if and only if for every $p \in k^n$ the
matrix $A(p)$ has rank $m - r$ (as above, $A(p)$ is the matrix with entries in
$\mathbb{C}$ obtained by evaluating the polynomial entries of A at the point p).

Proof. We prove the easy direction, and make some comments about the
reverse direction. Suppose that A presents M. Choose a free basis $e_1, \ldots, e_r$
and let $A'$ be the $r \times 1$ matrix of zeros presenting M with respect to this
basis. It follows from Exercise 33 of §1 that the matrices

$$D = \begin{pmatrix} A & 0 \\ 0 & I_r \end{pmatrix}, \qquad D' = \begin{pmatrix} I_m & 0 \\ 0 & A' \end{pmatrix}$$

are such that rank(D(p)) = rank(D'(p)) for all $p \in k^n$. (See Exercise 12.)
However, $D'$ is a constant matrix of rank m. Thus, $D(p)$ has rank m for
all p. It follows that $A(p)$ has rank $m - r$ for all p.

To get the converse, in the exercises we ask you to show that if
rank(A(q)) = $m - r$ for all q in some neighborhood of p (we assumed
that $k = \mathbb{C}$ to make sense of this), then $M_p$ is free of rank r. We then
ask you to show that if $M_p$ is free of rank r for all $p \in k^n$, then M is
projective of rank r. The Quillen-Suslin theorem then implies that M
is free. □

Additional Exercises for §4 
Exercise 7. 

a. Let M be a finitely generated free R-module. Show that any two bases
of M have the same number of elements.

b. Let M be any finitely generated R-module. Show that the maximal
number of R-linearly independent elements in any generating set of M
is the same. (This number is called the rank of M.)

Exercise 8. Prove the second and third parts of Proposition (4.3). 

Exercise 9. Suppose that $f \in R = k[x_1, \ldots, x_n]$ and $p = (p_1, \ldots, p_n) \in k^n$.
Show that $\mathfrak{m}_p = \langle x_1 - p_1, \ldots, x_n - p_n \rangle$ is the maximal ideal of $R_p$.
Explain why $f \equiv f(p) \bmod \mathfrak{m}_p$. (Compare the proof of Proposition (4.5).)

Exercise 10. Show that the Fitting ideals of M are an ascending sequence 
of ideals which do not depend on the choice of presentation matrix of M 
as follows. 

a. Given a finite generating set $f_1, \ldots, f_s$ for M, let $A = (a_{ij})$ be the
presentation matrix constructed by choosing one set of generators of
$\mathrm{Syz}(f_1, \ldots, f_s)$ and let $B = (b_{ij})$ be a presentation matrix constructed by
choosing another set of syzygies which generate. Show that the Fitting
ideals constructed from the matrix A are the same as the Fitting ideals
constructed from the matrix B. Hint: The hypotheses imply that the
columns of B can be expressed in terms of the columns of A. It is then
clear that $I_1(A) \supset I_1(B)$. To see that $I_2(A) \supset I_2(B)$ write out the two
by two minors of B in terms of the entries of A. Generalize to show that
$I_i(A) \supset I_i(B)$. Expressing the columns of A in terms of those of B gives
the reverse containments.

b. Show that the Fitting ideals do not depend on the choice of generators
$f_1, \ldots, f_s$ of M. Hint: Compare the ideals generated by the $i \times i$ minors
of a presentation matrix with respect to the generators $f_1, \ldots, f_s$ and
those generated by the $i \times i$ minors of a presentation matrix with respect
to the generators $f_1, \ldots, f_s, f$, where f is any element of M.

c. Show that $0 = F_0(M) \subset F_1(M) \subset \cdots \subset F_{s+1}(M) = R$ where s is as in
part a.

Exercise 11. In the ring $\mathbb{Z}[\sqrt{-5}]$, show that the ideal
$\langle 2, 1 + \sqrt{-5} \rangle \subset \mathbb{Z}[\sqrt{-5}]$ is a projective $\mathbb{Z}[\sqrt{-5}]$-module which is not free.

Exercise 12. Show directly from Exercise 31 of §1 (that is, do not use
Exercise 33) that the matrices

$$D = \begin{pmatrix} A & 0 \\ 0 & I_r \end{pmatrix} \quad \text{and} \quad D' = \begin{pmatrix} I_m & 0 \\ 0 & A' \end{pmatrix}$$

in the proof of Theorem (4.14) are such that rank(D(p)) = rank(D'(p)) for
all $p \in k^n$. Hint: Use the result of Exercise 31, and compare the result of
multiplying the matrix therein on the left by 

('o 1) - {’s !')■ 

Exercise 13. Suppose that $k = \mathbb{C}$, that A presents M. Show that $M_p$
is free of rank r if and only if rank(A(q)) = $m - r$ for all q in some
neighborhood of p.

Exercise 14. Let $R = k[x_1, \ldots, x_n]$. Show that M is a projective R-module
if and only if $M_p$ is a projective (hence free) $R_p$-module for all
$p \in k^n$.

Exercise 15. Let R be a ring and let $A = (a_1 \ \cdots \ a_m)$ be a
$1 \times m$ unimodular matrix (in this situation, unimodular means that
$R = \langle a_1, \ldots, a_m \rangle$). Also note that ker A is the syzygy module $\mathrm{Syz}(a_1, \ldots, a_m)$.
Prove that ker A is a free R-module if and only if there exists an invertible
$m \times m$ matrix B with coefficients in R whose first row is A. Thus, the
statement that the kernel of any unimodular row is free is equivalent to the
statement that any unimodular row with coefficients in $k[x_1, \ldots, x_n]$ is the
first row of a square matrix with polynomial coefficients and determinant 1.

Free Resolutions



In Chapter 5, we saw that to work with an R-module M, we needed not
just the generators $f_1, \ldots, f_t$ of M, but the relations they satisfy. Yet
the set of relations $\mathrm{Syz}(f_1, \ldots, f_t)$ is an R-module in a natural way and,
hence, to understand it, we need not just its generators $g_1, \ldots, g_s$, but the
set of relations $\mathrm{Syz}(g_1, \ldots, g_s)$ on these generators, the so-called second
syzygies. The second syzygies are again an R-module and to understand it,
we again need a set of generators and relations, the third syzygies, and so
on. We obtain a sequence, called a resolution, of generators and relations of
successive syzygy modules of M. In this chapter, we will study resolutions
and the information they encode about M. Throughout this chapter, R
will denote the polynomial ring $k[x_1, \ldots, x_n]$ or one of its localizations.



§1 Presentations and Resolutions of Modules 

Apart from the possible presence of nonzero elements in the module of 
syzygies on a minimal set of generators, one of the important things that 
distinguishes the theory of modules from the theory of vector spaces over 
a field is that many properties of modules are frequently stated in terms of 
homomorphisms and exact sequences. Although this is primarily cultural, 
it is very common and very convenient. In this first section, we introduce 
this language. 

To begin with, we recall the definition of exact. 

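(1.1) Definition. Consider a sequence of R-modules and homomorphisms

$$\cdots \longrightarrow M_{i+1} \xrightarrow{\varphi_{i+1}} M_i \xrightarrow{\varphi_i} M_{i-1} \longrightarrow \cdots$$

a. We say the sequence is exact at $M_i$ if $\mathrm{im}(\varphi_{i+1}) = \ker(\varphi_i)$.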

b. The entire sequence is said to be exact if it is exact at each Mi which 
is not at the beginning or the end of the sequence. 
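In the special case where R is a field, Definition (1.1) can be tested numerically: a sequence of matrices is exact at the middle term exactly when the composite is zero and the two ranks add up to the dimension of the middle space. The sketch below assumes $R = \mathbb{Q}$ and uses hypothetical helper names (`rank`, `matmul`, `exact_at_middle`); it does not handle modules over a general ring.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix over Q, by Gaussian elimination with exact fractions."""
    mat = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for col in range(len(mat[0])):
        pivot = next((r for r in range(rk, len(mat)) if mat[r][col] != 0), None)
        if pivot is None:
            continue
        mat[rk], mat[pivot] = mat[pivot], mat[rk]
        for r in range(len(mat)):
            if r != rk and mat[r][col] != 0:
                factor = mat[r][col] / mat[rk][col]
                mat[r] = [a - factor * b for a, b in zip(mat[r], mat[rk])]
        rk += 1
    return rk

def matmul(a, b):
    """Ordinary matrix product a * b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def exact_at_middle(incoming, outgoing):
    """True if im(incoming) = ker(outgoing) for linear maps V' -> V -> V''."""
    dim_v = len(incoming)  # rows of the incoming matrix = dim V
    composite_zero = all(v == 0 for row in matmul(outgoing, incoming) for v in row)
    # im is inside ker when the composite vanishes; equality is a dimension count.
    return composite_zero and rank(incoming) + rank(outgoing) == dim_v

eps = [[1], [1]]      # Q -> Q^2, m |-> (m, m)
delta = [[1, -1]]     # Q^2 -> Q, (a, b) |-> a - b
print(exact_at_middle(eps, delta))
```

The maps `eps` and `delta` are chosen to mirror the diagonal and difference maps that reappear in the exercises for this section.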



Many important properties of homomorphisms can be expressed by saying
that a certain sequence is exact. For example, we can phrase what it
means for an R-module homomorphism $\varphi : M \to N$ to be onto, injective,
or an isomorphism:

• $\varphi : M \to N$ is onto (or surjective) if and only if the sequence

$$M \xrightarrow{\varphi} N \to 0$$

is exact, where $N \to 0$ is the homomorphism sending every element of
N to 0. To prove this, recall that onto means $\mathrm{im}(\varphi) = N$. Then the
sequence is exact at N if and only if $\mathrm{im}(\varphi) = \ker(N \to 0) = N$, as
claimed.

• $\varphi : M \to N$ is one-to-one (or injective) if and only if the sequence

$$0 \to M \xrightarrow{\varphi} N$$

is exact, where $0 \to M$ is the homomorphism sending 0 to the additive
identity of M. This is equally easy to prove.

• $\varphi : M \to N$ is an isomorphism if and only if the sequence

$$0 \to M \xrightarrow{\varphi} N \to 0$$

is exact. This follows from the above since $\varphi$ is an isomorphism if and
only if it is one-to-one and onto.

Exact sequences are ubiquitous. Given any R-module homomorphism or
any pair of modules, one a submodule of the other, we get an associated
exact sequence as follows.

(1.2) Proposition. 

a. For any R-module homomorphism $\varphi : M \to N$, we have an exact
sequence

$$0 \to \ker(\varphi) \to M \xrightarrow{\varphi} N \to \mathrm{coker}(\varphi) \to 0,$$

where $\ker(\varphi) \to M$ is the inclusion mapping and $N \to \mathrm{coker}(\varphi) = N/\mathrm{im}(\varphi)$
is the natural homomorphism onto the quotient module, as in
Exercise 12 of Chapter 5.

b. If $Q \subset P$ is a submodule of an R-module P, then we have an exact
sequence

$$0 \to Q \to P \xrightarrow{\nu} P/Q \to 0,$$

where $Q \to P$ is the inclusion mapping, and $\nu$ is the natural
homomorphism onto the quotient module.

Proof. Exactness of the sequence in part a at $\ker(\varphi)$ follows from the
above bullets, and exactness at M is the definition of the kernel of a
homomorphism. Similarly, exactness at N comes from the definition of the
cokernel of a homomorphism (see Exercise 28 of Chapter 5, §1), and
exactness at $\mathrm{coker}(\varphi)$ follows from the above bullets. In the exercises, you will
show that part b follows from part a. □

Choosing elements of an R-module M is also conveniently described in
terms of homomorphisms.

(1.3) Proposition. Let M be an R-module. 

a. Choosing an element of M is equivalent to choosing a homomorphism
$R \to M$.

b. Choosing t elements of M is equivalent to choosing a homomorphism
$R^t \to M$.

c. Choosing a set of t generators of M is equivalent to choosing a
homomorphism $R^t \to M$ which is onto (i.e., an exact sequence $R^t \to M \to 0$).

d. If M is free, choosing a basis with t elements is equivalent to choosing
an isomorphism $R^t \to M$.

Proof. To see part a, note that the identity 1 is the distinguished element
of a ring R. Choosing an element f of a module M is the same as choosing
the R-module homomorphism $\varphi : R \to M$ which satisfies $\varphi(1) = f$. This
is true since $\varphi(1)$ determines the values of $\varphi$ on all $g \in R$:

$$\varphi(g) = \varphi(g \cdot 1) = g \cdot \varphi(1) = g f.$$

Thus, choosing t elements in M can be thought of as choosing t R-module
homomorphisms from R to M or, equivalently, as choosing an R-module
homomorphism from $R^t$ to M. This proves part b. More explicitly, if we
think of $R^t$ as the space of column vectors and denote the standard basis in
$R^t$ by $e_1, e_2, \ldots, e_t$, then choosing t elements $f_1, \ldots, f_t$ of M corresponds
to choosing the R-module homomorphism $\varphi : R^t \to M$ defined by setting
$\varphi(e_i) = f_i$, for all $i = 1, \ldots, t$. The image of $\varphi$ is the submodule
$\langle f_1, \ldots, f_t \rangle \subset M$. Hence, choosing a set of t generators for M corresponds
to choosing an R-module homomorphism $R^t \to M$ which is onto. By our
previous discussion, this is the same as choosing an exact sequence

$$R^t \to M \to 0.$$

This establishes part c, and part d follows immediately. □

In the exercises, we will see that we can also phrase what it means to 
be projective in terms of homomorphisms and exact sequences. Even more 
useful for our purposes, will be the interpretation of presentation matrices 
in terms of this language. The following terminology will be useful. 

(1.4) Definition. Let M be an R-module. A presentation for M is a set
of generators $f_1, \ldots, f_t$, together with a set of generators for the syzygy
module $\mathrm{Syz}(f_1, \ldots, f_t)$ of relations among $f_1, \ldots, f_t$.

One obtains a presentation matrix for a module M by arranging the
generators of $\mathrm{Syz}(f_1, \ldots, f_t)$ as columns; being given a presentation matrix
is essentially equivalent to being given a presentation of M. To reinterpret
Definition (1.4) in terms of exact sequences, note that the generators
$f_1, \ldots, f_t$ give a surjective homomorphism $\varphi : R^t \to M$ by part c of
Proposition (1.3), which means an exact sequence

$$R^t \xrightarrow{\varphi} M \to 0.$$

The map $\varphi$ sends $(g_1, \ldots, g_t) \in R^t$ to $\sum_{i=1}^{t} g_i f_i \in M$. It follows that a
syzygy on $f_1, \ldots, f_t$ is an element of the kernel of $\varphi$, i.e.,

$$\mathrm{Syz}(f_1, \ldots, f_t) = \ker(\varphi : R^t \to M).$$

By part c of Proposition (1.3), choosing a set of generators for the syzygy
module corresponds to choosing a homomorphism $\psi$ of $R^s$ onto $\ker(\varphi) = \mathrm{Syz}(f_1, \ldots, f_t)$.
But $\psi$ being onto is equivalent to $\mathrm{im}(\psi) = \ker(\varphi)$, which
is just the condition for exactness at $R^t$ in the sequence

(1.5)    $R^s \xrightarrow{\psi} R^t \xrightarrow{\varphi} M \to 0$.

This proves that a presentation of M is equivalent to an exact sequence of
the form (1.5). Also note that the matrix of $\psi$ with respect to the standard
bases of $R^s$ and $R^t$ is a presentation matrix for M.

We next observe that every finitely generated R-module has a
presentation.

(1.6) Proposition. Let M be a finitely generated R-module. 

a. M has a presentation of the form given by (1.5).

b. M is a homomorphic image of a free R-module. In fact, if $f_1, \ldots, f_t$ is
a set of generators of M, then $M \cong R^t/S$ where S is the submodule of
$R^t$ given by $S = \mathrm{Syz}(f_1, \ldots, f_t)$. Alternatively, if we let the matrix A
represent $\psi$ in (1.5), then $AR^s = \mathrm{im}(\psi)$ and $M \cong R^t/AR^s$.

Proof. Let $f_1, \ldots, f_t$ be a finite generating set of M. Part a follows from
the fact noted in Chapter 5, §2 that every submodule of $R^t$, in particular
$\mathrm{Syz}(f_1, \ldots, f_t) \subset R^t$, is finitely generated. Hence we can choose a finite
generating set for the syzygy module, which gives the exact sequence (1.5)
as above.

Part b follows from part a and Proposition 1.10 of Chapter 5, §1. □ 

Here is a simple example. Let $I = \langle x^2 - x, xy, y^2 - y \rangle$ in $R = k[x, y]$.
In geometric terms, I is the ideal of the variety $V = \{(0, 0), (1, 0), (0, 1)\}$
in $k^2$. We claim that I has a presentation given by the following exact
sequence:

(1.7)    $R^2 \xrightarrow{\psi} R^3 \xrightarrow{\varphi} I \to 0$,

where $\varphi$ is the homomorphism defined by the $1 \times 3$ matrix

$$A = \begin{pmatrix} x^2 - x & xy & y^2 - y \end{pmatrix}$$

and $\psi$ is defined by the $3 \times 2$ matrix

$$B = \begin{pmatrix} y & 0 \\ -x + 1 & y - 1 \\ 0 & -x \end{pmatrix}.$$

The following exercise gives one proof that (1.7) is a presentation of I.

Exercise 1. Let S denote $\mathrm{Syz}(x^2 - x, xy, y^2 - y)$.

a. Verify that the matrix product AB equals the $1 \times 2$ zero matrix, and
explain why this shows that $\mathrm{im}(\psi)$ (the module generated by the columns
of the matrix B) is contained in S.

b. To show that S is generated by the columns of B, we can use Schreyer’s
Theorem (Theorem (3.3) from Chapter 5 of this book). Check that the
generators for I form a lex Gröbner basis for I.

c. Compute the syzygies $s_{12}, s_{13}, s_{23}$ obtained from the S-polynomials on
the generators of I. By Schreyer’s Theorem, they generate S.

d. Explain how we could obtain a different presentation

$$R^3 \xrightarrow{\psi'} R^3 \xrightarrow{\varphi} I \to 0$$

of I using this computation, and find an explicit $3 \times 3$ matrix
representation of the homomorphism $\psi'$.

e. How do the columns of B relate to the generators $s_{12}, s_{13}, s_{23}$ of S?
Why does B have only two columns? Hint: Show that $s_{13} \in \langle s_{12}, s_{23} \rangle$
in $R^3$.
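The product in part a of Exercise 1 can also be checked mechanically. The sketch below stores a polynomial in x and y as a dictionary from exponent pairs to integer coefficients; this representation and the helper names `padd`, `pmul`, and `matmul` are assumptions made for illustration, not a standard library. The matrices are A and B from the presentation (1.7).

```python
def padd(f, g):
    """Add two polynomials stored as {(deg_x, deg_y): coefficient}."""
    h = dict(f)
    for mono, c in g.items():
        h[mono] = h.get(mono, 0) + c
    return {m: c for m, c in h.items() if c != 0}

def pmul(f, g):
    """Multiply two polynomials in the same dictionary representation."""
    h = {}
    for (a1, b1), c1 in f.items():
        for (a2, b2), c2 in g.items():
            mono = (a1 + a2, b1 + b2)
            h[mono] = h.get(mono, 0) + c1 * c2
    return {m: c for m, c in h.items() if c != 0}

def matmul(A, B):
    """Product of matrices whose entries are polynomials."""
    result = []
    for row in A:
        out_row = []
        for col in zip(*B):
            total = {}
            for a, b in zip(row, col):
                total = padd(total, pmul(a, b))
            out_row.append(total)
        result.append(out_row)
    return result

X, Y, ONE, ZERO, NEG = {(1, 0): 1}, {(0, 1): 1}, {(0, 0): 1}, {}, {(0, 0): -1}

A = [[padd(pmul(X, X), pmul(NEG, X)),    # x^2 - x
      pmul(X, Y),                        # xy
      padd(pmul(Y, Y), pmul(NEG, Y))]]   # y^2 - y

B = [[Y,                       ZERO],          # y         0
     [padd(pmul(NEG, X), ONE), padd(Y, NEG)],  # -x + 1    y - 1
     [ZERO,                    pmul(NEG, X)]]  # 0         -x

AB = matmul(A, B)
print(AB)  # the zero polynomial is the empty dictionary
```

A vanishing product only shows that the columns of B lie in S; that they generate S still requires the Gröbner basis argument in parts b and c.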

We have seen that specifying any module requires knowing both gener- 
ators and the relations between the generators. However, in presenting a 
module M, we insisted only on having a set of generators for the module of 
syzygies. Shouldn’t we have demanded a set of relations on the generators 
of the syzygy module? These are the so-called second syzygies. 

For example, in the presentation from part d of Exercise 1, there is a
relation between the generators of $\mathrm{Syz}(x^2 - x, xy, y^2 - y)$, namely

(1.8)    $y s_{12} - s_{13} + x s_{23} = 0$,

so $(y, -1, x)^T \in R^3$ would be a second syzygy.
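The relation (1.8) can be written out componentwise. The explicit vectors below are an assumption: they are one possible outcome of the division algorithm applied to the S-polynomials (the $s_{ij}$ can depend on choices made during division), picked so as to be consistent with (1.8).

```latex
% Assumed first syzygies on (x^2 - x, xy, y^2 - y), lex order with x > y:
s_{12} = (y,\; -x + 1,\; 0)^T, \qquad
s_{13} = (y^2,\; y - x,\; -x^2)^T, \qquad
s_{23} = (0,\; y - 1,\; -x)^T .

% Componentwise check of (1.8):
y\,s_{12} - s_{13} + x\,s_{23}
  = \begin{pmatrix}
      y^2 - y^2 + 0 \\
      (-xy + y) - (y - x) + (xy - x) \\
      0 + x^2 - x^2
    \end{pmatrix}
  = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
```

With these choices the first and third vectors are exactly the two columns of B in (1.7), and the middle one is $y s_{12} + x s_{23}$.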

Likewise, we would like to know not just a generating set for the second
syzygies, but the relations among those generators (the third syzygies), and
so on. As you might imagine, the connection between a module, its first
syzygies, its second syzygies, and so forth can also be phrased in terms of
an exact sequence of modules and homomorphisms. The idea is simple: we
just iterate the construction of the exact sequence giving a presentation. For
instance, starting from the sequence (1.5) corresponding to a presentation
for M, if we want to know the second syzygies as well, we need another
step in the sequence:

$$R^r \xrightarrow{\lambda} R^s \xrightarrow{\psi} R^t \xrightarrow{\varphi} M \to 0,$$

where now the image of $\lambda : R^r \to R^s$ is equal to the kernel of $\psi$ (the
second syzygy module). Continuing in the same way to the third and higher
syzygies, we produce longer and longer exact sequences. We wind up with
a free resolution of M. The precise definition is as follows.

(1.9) Definition. Let M be an R-module. A free resolution of M is an
exact sequence of the form

$$\cdots \to F_2 \xrightarrow{\varphi_2} F_1 \xrightarrow{\varphi_1} F_0 \xrightarrow{\varphi_0} M \to 0,$$

where for all i, $F_i \cong R^{r_i}$ is a free R-module. If there is an $\ell$ such that
$F_{\ell+1} = F_{\ell+2} = \cdots = 0$, but $F_\ell \ne 0$, then we say the resolution is finite,
of length $\ell$. In a finite resolution of length $\ell$, we will usually write the
resolution as

$$0 \to F_\ell \to F_{\ell-1} \to \cdots \to F_1 \to F_0 \to M \to 0.$$

For an example, consider the presentation (1.7) for

$$I = \langle x^2 - x, xy, y^2 - y \rangle$$

in $R = k[x, y]$. If $(a_1, a_2)^T \in R^2$ is any syzygy on the columns of B with
$a_i \in R$, then looking at the first components, we see that $y a_1 = 0$, so
$a_1 = 0$. Similarly from the third components $a_2 = 0$. Hence the kernel of
$\psi$ in (1.7) is the zero submodule. An equivalent way to say this is that the
columns of B are a basis for $\mathrm{Syz}(x^2 - x, xy, y^2 - y)$, so the first syzygy
module is a free module. As a result, (1.7) extends to an exact sequence:

(1.10)    $0 \to R^2 \xrightarrow{\psi} R^3 \xrightarrow{\varphi} I \to 0$.

According to Definition (1.9), this is a free resolution of length 1 for I.

Exercise 2. Show that I also has a free resolution of length 2 obtained
by extending the presentation given in part d of Exercise 1 above:

(1.11)    $0 \to R \xrightarrow{\lambda} R^3 \xrightarrow{\psi'} R^3 \xrightarrow{\varphi} I \to 0$,

where the homomorphism $\lambda$ comes from the syzygy given in (1.8).

Generalizing the observation about the matrix B above, we have the
following characterization of finite resolutions.

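(1.12) Proposition. In a finite free resolution

$$0 \to F_\ell \xrightarrow{\varphi_\ell} F_{\ell-1} \xrightarrow{\varphi_{\ell-1}} F_{\ell-2} \to \cdots \to F_0 \xrightarrow{\varphi_0} M \to 0,$$

$\ker(\varphi_{\ell-1})$ is a free module. Conversely, if M has a free resolution in which
$\ker(\varphi_{\ell-1})$ is a free module for some $\ell$, then M has a finite free resolution
of length $\ell$.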

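Proof. If we have a finite resolution of length $\ell$, then $\varphi_\ell$ is one-to-one by
exactness at $F_\ell$, so its image is isomorphic to $F_\ell$, a free module. Also,
exactness at $F_{\ell-1}$ implies $\ker(\varphi_{\ell-1}) = \mathrm{im}(\varphi_\ell)$, so $\ker(\varphi_{\ell-1})$ is a free module.
Conversely, if $\ker(\varphi_{\ell-1})$ is a free module, then the partial resolution

$$F_{\ell-1} \xrightarrow{\varphi_{\ell-1}} F_{\ell-2} \to \cdots \to F_0 \xrightarrow{\varphi_0} M \to 0$$

can be completed to a finite resolution of length $\ell$

$$0 \to F_\ell \to F_{\ell-1} \xrightarrow{\varphi_{\ell-1}} F_{\ell-2} \to \cdots \to F_0 \xrightarrow{\varphi_0} M \to 0,$$

by taking $F_\ell$ to be the free module $\ker(\varphi_{\ell-1})$ and letting the arrow
$F_\ell \to F_{\ell-1}$ be the inclusion mapping. □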

Both (1.11) and the more economical resolution (1.10) came from the
computation of the syzygies $s_{ij}$ on the Gröbner basis for I. By Schreyer’s
Theorem again, the same process can be applied to produce a free resolution
of any submodule M of a free module over R. If $G = \{g_1, \ldots, g_s\}$ is
a Gröbner basis for M with respect to any monomial order, then the $s_{ij}$
are a Gröbner basis for the first syzygy module (with respect to the $>_G$
order from Theorem (3.3) of Chapter 5). Since this is true, we can iterate
the process and produce Gröbner bases for the modules of second, third,
and all higher syzygies. In other words, Schreyer’s Theorem forms the basis
for an algorithm for computing any finite number of terms in a free resolution.
This algorithm is implemented in Singular, in CoCoA, in the CALI
package for REDUCE, and in the res command of Macaulay.

For example, consider the homogeneous ideal

$$M = \langle z^3 - yw^2, yz - xw, y^3 - x^2z, xz^2 - y^2w \rangle$$

in $k[x, y, z, w]$. This is the ideal of a rational quartic curve in $\mathbb{P}^3$. Here is a
Macaulay session calculating and displaying a free resolution for M:

% res M MR
1.2.3...4...5...
computation complete after degree 5
% pres MR

z3-yw2 yz-xw y3-x2z xz2-y2w

0  -x 0  -y
xz yw y2 z2
-w 0  -z 0
-y z  -x w

-z
-y
w
x

The output shows the matrices in a finite free resolution of the form

(1.13)    $0 \to R \to R^4 \to R^4 \to M \to 0$,

from the “front” of the resolution “back.” In particular, the first matrix
($1 \times 4$) gives the generators of M, the columns of the second matrix give
generators for the first syzygies, and the third matrix ($4 \times 1$) gives a generator
for the second syzygy module, which is free.
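One way to sanity-check output of this kind is to multiply consecutive matrices and confirm that each product is zero. The sketch below does this for the three matrices of the resolution (1.13). It assumes a dictionary representation of polynomials and a tiny parser, `poly`, for Macaulay-style monomials with coefficients $\pm 1$ only; both are hypothetical helpers. The last matrix is entered by hand as the column $(-z, -y, w, x)^T$, and its first entry should be treated as an assumption.

```python
import re

VARS = "xyzw"

def poly(text):
    """Parse input such as 'z3-yw2' into {exponent tuple: coefficient} (coefficients +-1)."""
    result = {}
    for sign, body in re.findall(r"([+-]?)([a-z0-9]+)", text):
        if body == "0":
            continue
        exps = [0] * len(VARS)
        for var, power in re.findall(r"([a-z])(\d*)", body):
            exps[VARS.index(var)] += int(power or 1)
        key = tuple(exps)
        result[key] = result.get(key, 0) + (-1 if sign == "-" else 1)
    return {m: c for m, c in result.items() if c != 0}

def padd(f, g):
    h = dict(f)
    for mono, c in g.items():
        h[mono] = h.get(mono, 0) + c
    return {m: c for m, c in h.items() if c != 0}

def pmul(f, g):
    h = {}
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            mono = tuple(a + b for a, b in zip(m1, m2))
            h[mono] = h.get(mono, 0) + c1 * c2
    return {m: c for m, c in h.items() if c != 0}

def matmul(A, B):
    """Product of matrices with polynomial entries."""
    out = []
    for row in A:
        out_row = []
        for col in zip(*B):
            total = {}
            for a, b in zip(row, col):
                total = padd(total, pmul(a, b))
            out_row.append(total)
        out.append(out_row)
    return out

gens = [[poly(t) for t in "z3-yw2 yz-xw y3-x2z xz2-y2w".split()]]
first_syz = [[poly(t) for t in row.split()]
             for row in ["0 -x 0 -y", "xz yw y2 z2", "-w 0 -z 0", "-y z -x w"]]
second_syz = [[poly(t)] for t in "-z -y w x".split()]  # first entry assumed

print(matmul(gens, first_syz))       # a 1 x 4 row of zero polynomials
print(matmul(first_syz, second_syz)) # a 4 x 1 column of zero polynomials
```

Vanishing products show only that each image lies in the next kernel; exactness itself rests on Schreyer’s Theorem.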

Exercise 3. 

a. Verify by hand that at each step in the sequence (1.13), the image of the 
mapping “coming in” is contained in the kernel of the mapping “going 
out.” 

b. Verify that the generators of M form a Gröbner basis of M for the grevlex
order with $x > y > z > w$, and compute the first syzygy module using
Schreyer’s theorem. Why is the first syzygy module generated by just 4
elements (the columns of the $4 \times 4$ matrix), and not $6 = \binom{4}{2}$ elements
$s_{ij}$ as one might expect?

The programs Singular and CALI can be used to compute resolutions of
ideals whose generators are not homogeneous (and, more generally, modules
which are not graded), as well as resolutions of modules over local rings.
Here, for example, is a Singular session computing a resolution of the ideal

(1.14)    $I = \langle z^3 - y, yz - x, y^3 - x^2z, xz^2 - y^2 \rangle$

in $k[x, y, z]$ (note that I is obtained by dehomogenizing the generators
of M above).

> ring r=0, (x,y,z), dp;
> ideal I=(z3-y,yz-x,y3-x2z,xz2-y2);
> res(I,0);
[1]:
_[1]=z3-y
_[2]=yz-x
_[3]=y3-x2z
_[4]=xz2-y2
[2]:
_[1]=x*gen(1)-y*gen(2)-z*gen(4)
_[2]=z2*gen(2)-y*gen(1)+1*gen(4)
_[3]=xz*gen(2)-y*gen(4)-1*gen(3)
[3]:
_[1]=0

The first line of the input specifies that the characteristic of the field is
0, the ring variables are x, y, z, and the monomial order is graded reverse
lex. The argument “0” in the res command says that the resolution should
have as many steps as variables (the reason for this choice will become
clear in the next section). Here, again, the output is a set of columns that
generate (gen(1), gen(2), gen(3), gen(4) refer to the standard basis
columns $e_1, e_2, e_3, e_4$ of $k[x, y, z]^4$).

See the exercises below for some additional examples. Of course, this 
raises the question whether finite resolutions always exist. Are we in a 
situation of potential infinite regress or does this process always stop even- 
tually, as in the examples above? See Exercise 11 below for an example 
where the answer is no, but where R is not a polynomial ring. We shall 
return to this question in the next section. 



Additional Exercises for §1 
Exercise 4. 

a. Prove the second bullet, which asserts that $\varphi : M \to N$ is one-to-one if
and only if $0 \to M \to N$ is exact.

b. Explain how part b of Proposition (1.2) follows from part a. 

Exercise 5. Let $M_1$, $M_2$ be R-submodules of an R-module N. Let
$M_1 \oplus M_2$ be the direct sum as in Exercise 4 of Chapter 5, §1, and let
$M_1 + M_2 \subset N$ be the sum as in Exercise 14 of Chapter 5, §1.

a. Let $\varepsilon : M_1 \cap M_2 \to M_1 \oplus M_2$ be the mapping defined by $\varepsilon(m) = (m, m)$.
Show that $\varepsilon$ is an R-module homomorphism.

b. Show that $\delta : M_1 \oplus M_2 \to M_1 + M_2$ defined by $\delta(m_1, m_2) = m_1 - m_2$
is an R-module homomorphism.

c. Show that

$$0 \to M_1 \cap M_2 \xrightarrow{\varepsilon} M_1 \oplus M_2 \xrightarrow{\delta} M_1 + M_2 \to 0$$

is an exact sequence.

Exercise 6. Let $M_1$ and $M_2$ be submodules of an R-module N.

a. Show that the mappings $\psi_i : M_i \to M_1 + M_2$ ($i = 1, 2$) defined by
$\psi_1(m_1) = m_1 + 0 \in M_1 + M_2$ and $\psi_2(m_2) = 0 + m_2 \in M_1 + M_2$ are
one-to-one module homomorphisms. Hence $M_1$ and $M_2$ are submodules
of $M_1 + M_2$.

b. Consider the homomorphism $\varphi : M_2 \to (M_1 + M_2)/M_1$ obtained by
composing the inclusion $M_2 \to M_1 + M_2$ and the natural homomorphism
$M_1 + M_2 \to (M_1 + M_2)/M_1$. Identify the kernel of $\varphi$, and
deduce that there is an isomorphism of R-modules
$(M_1 + M_2)/M_1 \cong M_2/(M_1 \cap M_2)$.



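Exercise 7.

a. Let

$$0 \to M_n \xrightarrow{\varphi_n} M_{n-1} \xrightarrow{\varphi_{n-1}} M_{n-2} \to \cdots \xrightarrow{\varphi_1} M_0 \to 0$$

be a “long” exact sequence of $R$-modules and homomorphisms. Show
that there are “short” exact sequences

$$0 \to \ker(\varphi_i) \to M_i \to \ker(\varphi_{i-1}) \to 0$$

for each $i = 1, \ldots, n$, where the arrow $M_i \to \ker(\varphi_{i-1})$ is given by the
homomorphism $\varphi_i$.

b. Conversely, given

$$0 \to \ker(\varphi_i) \to M_i \xrightarrow{\varphi_i} N_i \to 0$$

where $N_i = \ker(\varphi_{i-1}) \subset M_{i-1}$, show that these short exact sequences
can be spliced together into a long exact sequence

$$0 \to \ker(\varphi_{n-1}) \to M_{n-1} \xrightarrow{\varphi_{n-1}} M_{n-2} \to \cdots \to M_1 \xrightarrow{\varphi_1} \mathrm{im}(\varphi_1) \to 0.$$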

c. Explain how a resolution of a module is obtained by splicing together 
presentations of successive syzygy modules. 

Exercise 8. Let $V_i$, $i = 0, \ldots, n$ be finite dimensional vector spaces over
a field $k$, and let

$$0 \to V_n \to V_{n-1} \to V_{n-2} \to \cdots \to V_0 \to 0$$

be an exact sequence of $k$-linear mappings. Show that the alternating sum
of the dimensions of the $V_i$ satisfies:

$$\sum_{\ell=0}^{n} (-1)^{\ell} \dim_k(V_\ell) = 0.$$

Hint: Use Exercise 7 and the dimension theorem for a linear mapping
$\varphi : V \to W$:

$$\dim_k(V) = \dim_k(\ker(\varphi)) + \dim_k(\mathrm{im}(\varphi)).$$

Exercise 9. Let

$$0 \to F_\ell \to \cdots \to F_2 \to F_1 \to F_0 \to M \to 0$$

be a finite free resolution of a submodule $M \subset R^p$. Show how to obtain
a finite free resolution of the quotient module $R^p/M$ from the resolution
for $M$. Hint: There is an exact sequence $0 \to M \to R^p \to R^p/M \to 0$ by
Proposition (1.2). Use the idea of Exercise 7 part b to splice together the
two sequences.

Exercise 10. For each of the following modules, find a free resolution 
either by hand or by using a computer algebra system, 

a. $M = \langle xy, xz, yz \rangle \subset k[x, y, z]$.

b. $M = \langle xy - uv, xz - uv, yz - uv \rangle \subset k[x, y, z, u, v]$.

c. $M = \langle xy - xv, xz - yv, yz - xu \rangle \subset k[x, y, z, u, v]$.

d. M the module generated by the columns of the matrix 

, , _ / — 2bcd a — b\ 

— d^ b^ acd c-\- d J 

in $k[a, b, c, d]^2$.

e. $M = \langle x^2, y^2, z^2, xy, xz, yz \rangle \subset k[x, y, z]$.

f. $M = \langle x^3, y^3, x^2y, xy^2 \rangle \subset k[x, y, z]$.

Exercise 11. If we work over other rings $R$ besides polynomial rings, then
it is not difficult to find modules with no finite free resolutions. For example,
consider $R = k[x]/\langle x^2 \rangle$, and $M = \langle x \rangle \subset R$.

a. What is the kernel of the mapping $\varphi : R \to M$ given by multiplication
by $x$?

b. Show that

$$\cdots \xrightarrow{x} R \xrightarrow{x} R \xrightarrow{x} M \to 0$$

is an infinite free resolution of $M$ over $R$, where $x$ denotes multiplication
by $x$.

c. Show that every free resolution of $M$ over $R$ is infinite. Hint: One way
is to show that any free resolution of $M$ must “contain” the resolution
from part b in a suitable sense.

Exercise 12. We say that an exact sequence of $R$-modules

$$0 \to M \xrightarrow{f} N \xrightarrow{g} P \to 0$$

splits if there is a homomorphism $\varphi : P \to N$ such that $g \circ \varphi = \mathrm{id}$.

a. Show that the condition that the sequence above splits is equivalent
to the condition that $N \cong M \oplus P$ such that $f$ becomes the inclusion
$a \mapsto (a, 0)$ and $g$ becomes the projection $(a, b) \mapsto b$.

b. Show that the condition that the sequence splits is equivalent to the
existence of a homomorphism $\psi : N \to M$ such that $\psi \circ f = \mathrm{id}$. Hint:
use part a.

c. Show that $P$ is a projective module (that is, a direct summand of a free
module; see Definition (4.12) of Chapter 5) if and only if every exact
sequence of the form above splits.

d. Show that $P$ is projective if and only if given every homomorphism
$f : P \to M_1$ and any surjective homomorphism $g : M_2 \to M_1$, there
exists a homomorphism $h : P \to M_2$ such that $f = g \circ h$.



§2 Hilbert’s Syzygy Theorem 

In §1, we raised the question of whether every $R$-module has a finite free
resolution, and we saw in Exercise 11 that the answer is no if $R$ is the
finite-dimensional algebra $R = k[x]/\langle x^2 \rangle$. However, when
$R = k[x_1, \ldots, x_n]$ the
situation is much better, and we will consider only polynomial rings in this
section. The main fact we will establish is the following famous result of
Hilbert.

(2.1) Theorem (Hilbert Syzygy Theorem). Let $R = k[x_1, \ldots, x_n]$.
Then every finitely generated $R$-module has a finite free resolution of length
at most $n$.

A comment is in order. As we saw in the examples in §1, it is not true
that all finite free resolutions of a given module have the same length.
The Syzygy Theorem only asserts the existence of some free resolution of
length $\le n$ for every finitely generated module over the polynomial ring in
$n$ variables. Also, remember from Definition (1.9) that length $\le n$ implies
that an $R$-module $M$ has a free resolution of the form

$$0 \to F_\ell \to \cdots \to F_1 \to F_0 \to M \to 0, \qquad \ell \le n.$$

This has $\ell + 1 \le n + 1$ free modules, so that the Syzygy Theorem asserts
the existence of a free resolution with at most $n + 1$ free modules in it.

The proof we will present is due to Schreyer. It is based on the following
observation about resolutions produced by the Gröbner basis method
described in §1, using Schreyer’s Theorem, which is Theorem (3.3) of Chapter 5.

(2.2) Lemma. Let $\mathcal{G}$ be a Gröbner basis for a submodule $M \subset R^t$ with
respect to an arbitrary monomial order, and arrange the elements of $\mathcal{G}$ to form
an ordered $s$-tuple $G = (g_1, \ldots, g_s)$ so that whenever $\mathrm{LT}(g_i)$ and $\mathrm{LT}(g_j)$
contain the same standard basis vector $e_k$ and $i < j$, then
$\mathrm{LM}(g_i)/e_k >_{lex} \mathrm{LM}(g_j)/e_k$, where $>_{lex}$ is the lex order on $R$ with
$x_1 > \cdots > x_n$. If the variables $x_1, \ldots, x_m$ do not appear in the leading
terms of $\mathcal{G}$, then $x_1, \ldots, x_{m+1}$
do not appear in the leading terms of the $s_{ij} \in \mathrm{Syz}(G)$ with respect to the
order $>_{\mathcal{G}}$ used in Theorem (3.3) of Chapter 5.

Proof of the lemma. By the first step in the proof of Theorem (3.3)
of Chapter 5,

(2.3)  $\mathrm{LT}_{>_{\mathcal{G}}}(s_{ij}) = \bigl(m_{ij}/\mathrm{LT}(g_i)\bigr) E_i,$

where $m_{ij} = \mathrm{lcm}(\mathrm{LT}(g_i), \mathrm{LT}(g_j))$, and $E_i$ is the standard basis vector in $R^s$.
As always, it suffices to consider only the $s_{ij}$ such that $\mathrm{LT}(g_i)$ and $\mathrm{LT}(g_j)$
contain the same standard basis vector $e_k$ in $R^t$, and such that $i < j$.
By the hypothesis on the ordering of the components of $G$,
$\mathrm{LM}(g_i)/e_k >_{lex} \mathrm{LM}(g_j)/e_k$. Since $x_1, \ldots, x_m$ do not appear in the
leading terms, this implies that we can write

$$\mathrm{LM}(g_i)/e_k = x_{m+1}^{a} n_i, \qquad \mathrm{LM}(g_j)/e_k = x_{m+1}^{b} n_j,$$

where $a \ge b$, and $n_i$, $n_j$ are monomials in $R$ containing only $x_{m+2}, \ldots, x_n$.
But then $\mathrm{lcm}(\mathrm{LT}(g_i), \mathrm{LT}(g_j))$ contains $x_{m+1}^{a}$, and by (2.3),
$\mathrm{LT}_{>_{\mathcal{G}}}(s_{ij})$ does not contain $x_1, \ldots, x_m, x_{m+1}$. □
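The mechanism of Lemma (2.2) can be tried out on monomials alone. The following is a minimal sketch, assuming the simplest case $t = 1$ (an ideal, so all leading terms involve the same basis vector) and $m = 0$. Monomials are stored as exponent tuples, and the function names are ours, chosen for illustration. Python compares tuples lexicographically, which agrees with lex order for $x_1 > \cdots > x_n$.

```python
def lcm(a, b):
    """Least common multiple of two monomials given as exponent tuples."""
    return tuple(max(p, q) for p, q in zip(a, b))

def divide(m, d):
    """Quotient m/d of monomials, assuming d divides m."""
    return tuple(p - q for p, q in zip(m, d))

def schreyer_leading_monomials(leading):
    """Order the leading monomials lex-decreasingly as in Lemma (2.2) and
    return m_ij / LM(g_i) for all i < j, following formula (2.3)."""
    ordered = sorted(leading, reverse=True)
    result = {}
    for i in range(len(ordered)):
        for j in range(i + 1, len(ordered)):
            m_ij = lcm(ordered[i], ordered[j])
            result[(i + 1, j + 1)] = divide(m_ij, ordered[i])
    return ordered, result

# Illustrative leading monomials in k[x, y, z]: xz, y^3, x^2*y
leading = [(1, 0, 1), (0, 3, 0), (2, 1, 0)]
ordered, lead_s = schreyer_leading_monomials(leading)
```

In this example every entry of `lead_s` has exponent 0 in the first variable, so $x = x_1$ has disappeared from the leading terms of the $s_{ij}$, as the lemma predicts for $m = 0$.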

We are now ready for the proof of Theorem (2.1). 

Proof of the theorem. Since we assume $M$ is finitely generated as
an $R$-module, by (1.5) of this chapter, there is a presentation for $M$ of the
form

(2.4)  $F_1 \xrightarrow{\varphi_1} F_0 \to M \to 0$

corresponding to a choice of a generating set $(f_1, \ldots, f_{r_0})$ for $M$, and a
Gröbner basis $\mathcal{G}_0 = \{g_1, \ldots, g_{r_1}\}$ for
$\mathrm{Syz}(f_1, \ldots, f_{r_0}) = \mathrm{im}(\varphi_1) \subset F_0 = R^{r_0}$
with respect to any monomial order on $F_0$. Order the elements of $\mathcal{G}_0$
as described in Lemma (2.2) to obtain a vector $G_0$, and apply Schreyer’s
Theorem to compute a Gröbner basis $\mathcal{G}_1$ for the module
$\mathrm{Syz}(G_0) \subset F_1 = R^{r_1}$ (with respect to the $>_{\mathcal{G}_0}$ order).
We may assume that $\mathcal{G}_1$ is reduced.
By the lemma, at least $x_1$ will be missing from the leading terms of $\mathcal{G}_1$.
Moreover if the Gröbner basis contains $r_2$ elements, we obtain an exact
sequence

$$F_2 \xrightarrow{\varphi_2} F_1 \xrightarrow{\varphi_1} F_0 \to M \to 0$$

with $F_2 = R^{r_2}$, and $\mathrm{im}(\varphi_2) = \mathrm{Syz}(G_1)$. Now iterate the process to obtain
$\varphi_i : F_i \to F_{i-1}$, where $\mathrm{im}(\varphi_i) = \mathrm{Syz}(G_{i-1})$ and
$\mathcal{G}_i \subset R^{r_i}$ is a Gröbner
basis for $\mathrm{Syz}(G_{i-1})$, where each time we order the Gröbner basis $\mathcal{G}_{i-1}$ to
form the vector $G_{i-1}$ so that the hypothesis of Lemma (2.2) is satisfied.

Since the number of variables present in the leading terms of the Gröbner
basis elements decreases by at least one at each step, by an easy induction
argument, after some number $\ell \le n$ of steps, the leading terms of the
reduced Gröbner basis $\mathcal{G}_\ell$ do not contain any of the variables $x_1, \ldots, x_n$.
At this point, we will have extended (2.4) to an exact sequence

(2.5)  $F_\ell \xrightarrow{\varphi_\ell} F_{\ell-1} \to \cdots \to F_1 \xrightarrow{\varphi_1} F_0 \to M \to 0,$

and the leading terms in $\mathcal{G}_\ell$ will be non-zero constants times standard
basis vectors from $F_\ell$. In Exercise 8 below, you will show that this implies
$\mathrm{Syz}(G_{\ell-1})$ is a free module, and $\mathcal{G}_\ell$ is a module basis as well as a Gröbner
basis. Hence by Proposition (1.12) we can extend (2.5) to another exact
sequence by adding a zero at the left, and as a result we have produced a
free resolution of length $\ell \le n$ for $M$. □

Here are some additional examples illustrating the Syzygy Theorem. In 
the examples we saw in the text in §1, we always found resolutions of 
length strictly less than the number of variables in R. But in some cases, 
the shortest possible resolutions are of length exactly n. 

Exercise 1. Consider the ideal $I = \langle x^2 - x, xy, y^2 - y \rangle \subset k[x, y]$ from
(1.7) of this chapter, and let $M = k[x, y]/I$, which is also a module over
$R = k[x, y]$. Using Exercise 9 from §1, show that $M$ has a free resolution
of length 2, of the form

$$0 \to R^2 \to R^3 \to R \to M \to 0.$$

In this case, it is also possible using localization (see Chapter 4) to show
that $M$ has no free resolution of length $\le 1$. See Exercise 9 below for a
sketch.

On the other hand, we might ask whether having an especially short finite 
free resolution indicates something special about an ideal or a module. For 
example, if $M$ has a resolution $0 \to R^r \to M \to 0$ of length 0, then $M$ is
isomorphic to $R^r$ as an $R$-module. Hence $M$ is free, and this is certainly a
special property! From Chapter 5, §1, we know this happens for ideals only
when $M = \langle f \rangle$ is principal. Similarly, we can ask what can be said about
free resolutions of length 1. The next examples indicate a special feature 
of resolutions of length 1 for a certain class of ideals. 

Exercise 2. Let $I \subset k[x, y, z, w]$ denote the ideal of the twisted cubic in
$\mathbb{P}^3$, with the following generators:

$$I = \langle g_1, g_2, g_3 \rangle = \langle xz - y^2,\ xw - yz,\ yw - z^2 \rangle.$$

a. Show that the given generators form a grevlex Gröbner basis for $I$.

b. Apply Schreyer’s Theorem to find a Gröbner basis for the module of
first syzygies on the given generators for $I$.

c. Show that $s_{12}$ and $s_{23}$ form a basis for $\mathrm{Syz}(xz - y^2, xw - yz, yw - z^2)$.

d. Use the above calculations to produce a finite free resolution of $I$, of the
form

$$0 \to R^2 \xrightarrow{A} R^3 \to I \to 0.$$

e. Show that the determinants of the $2 \times 2$ minors of $A$ are just the $g_i$ (up
to signs).

Exercise 3. (For this exercise, you will probably want to use a computer
algebra system.) In $k^2$ consider the points

$$p_1 = (0, 0), \quad p_2 = (1, 0), \quad p_3 = (0, 1),$$
$$p_4 = (2, 1), \quad p_5 = (1, 2), \quad p_6 = (3, 3),$$

and let $I_i = I(\{p_i\})$ for each $i$, so for instance $I_3 = \langle x, y - 1 \rangle$.

a. Find a grevlex Gröbner basis for

$$J = I(\{p_1, \ldots, p_6\}) = I_1 \cap \cdots \cap I_6.$$

b. Compute a free resolution of $J$ of the form

$$0 \to R^3 \xrightarrow{A} R^4 \to J \to 0,$$

where each entry of $A$ is of total degree at most 1 in $x$ and $y$.

c. Show that the determinants of the $3 \times 3$ minors of $A$ are the generators
of $J$ (up to signs).

The examples in Exercises 2 and 3 are instances of the following general 
result, which is a part of the Hilbert-Burch Theorem. 

(2.6) Proposition. Suppose that an ideal $I$ in $R = k[x_1, \ldots, x_n]$ has a
free resolution of the form

$$0 \to R^{m-1} \xrightarrow{A} R^m \xrightarrow{B} I \to 0$$

for some $m$. Then there exists a nonzero element $g \in R$ such that
$B = ( g\tilde{f}_1 \ \ldots \ g\tilde{f}_m )$, where $\tilde{f}_i$ is the determinant of the
$(m - 1) \times (m - 1)$
submatrix of $A$ obtained by deleting row $i$. If $k$ is algebraically closed and
$V(I)$ has dimension $n - 2$, then we may take $g = 1$.

Proof. The proof is outlined in Exercise 11 below. □
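As a quick sanity check of the shape described in Proposition (2.6), here is a minimal sketch for the twisted cubic generators $g_1 = xz - y^2$, $g_2 = xw - yz$, $g_3 = yw - z^2$ of Exercise 2. The matrix `A` below is one possible choice of syzygy matrix, found by hand for this illustration (it is an assumption, not output taken from the text), and the helpers `poly`, `add`, `mul`, `neg` are a small ad hoc polynomial arithmetic on dictionaries mapping exponent tuples to integer coefficients.

```python
from collections import defaultdict
from functools import reduce

VARS = "xyzw"  # variable order used for exponent tuples

def poly(*terms):
    """Build {exponent tuple: coeff} from (coeff, 'monomial') pairs, e.g. (1, 'xz')."""
    p = defaultdict(int)
    for coeff, mono in terms:
        p[tuple(mono.count(v) for v in VARS)] += coeff
    return {e: c for e, c in p.items() if c != 0}

def add(p, q):
    r = defaultdict(int)
    for e, c in list(p.items()) + list(q.items()):
        r[e] += c
    return {e: c for e, c in r.items() if c != 0}

def mul(p, q):
    r = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            r[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return {e: c for e, c in r.items() if c != 0}

def neg(p):
    return {e: -c for e, c in p.items()}

g = [poly((1, "xz"), (-1, "yy")),   # g1 = xz - y^2
     poly((1, "xw"), (-1, "yz")),   # g2 = xw - yz
     poly((1, "yw"), (-1, "zz"))]   # g3 = yw - z^2

# Assumed 3 x 2 syzygy matrix: each column is a relation on (g1, g2, g3).
A = [[poly((1, "z")), poly((1, "w"))],
     [poly((-1, "y")), poly((-1, "z"))],
     [poly((1, "x")), poly((1, "y"))]]

# B * A, where B = (g1 g2 g3); exactness requires both entries to vanish.
relations = [reduce(add, (mul(g[i], A[i][j]) for i in range(3)), {})
             for j in range(2)]

def minor(deleted_row):
    """Determinant of the 2 x 2 submatrix of A obtained by deleting one row."""
    r0, r1 = [A[r] for r in range(3) if r != deleted_row]
    return add(mul(r0[0], r1[1]), neg(mul(r0[1], r1[0])))

minors = [minor(i) for i in range(3)]
```

With this choice of `A`, the three minors come out as $g_1$, $-g_2$, $g_3$, so the generators are recovered up to sign with $g = 1$.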

The full Hilbert-Burch Theorem also gives a sufficient condition for the
existence of a resolution of the form given in the proposition. For example,
such a resolution exists when the quotient ring $R/I$ is Cohen-Macaulay of
codimension 2. This condition is satisfied, for instance, if $I \subset k[x, y, z]$ is
the ideal of a finite subset of $\mathbb{P}^2$ (including the case where one or more
of the points has multiplicity > 1 as defined in Chapter 4). We will not
give the precise definition of the Cohen-Macaulay condition here. Instead
we refer the interested reader to [Eis], where this and many of the other
known results concerning the shapes of free resolutions for certain classes of
ideals in polynomial and local rings are discussed. In particular, the length
of the shortest finite free resolution of an $R$-module $M$ is an important
invariant called the projective dimension of $M$.



Additional Exercises for §2 

Exercise 4. Let $I$ be the ideal in $k[x, y]$ generated by the grevlex Gröbner
basis

$$\{g_1, g_2, g_3\} = \{x^2 + \tfrac{3}{2}xy + \tfrac{1}{2}y^2 - \tfrac{3}{2}x - \tfrac{3}{2}y,\ xy^2 - x,\ y^3 - y\}.$$

This ideal was considered in Chapter 2, §2 (with $k = \mathbb{C}$), and we saw there
that $V(I)$ is a finite set containing 5 points in $k^2$, each with multiplicity 1.

a. Applying Schreyer’s Theorem, show that $\mathrm{Syz}(g_1, g_2, g_3)$ is generated by
the columns of the matrix

$$A = \begin{pmatrix} y^2 - 1 & 0 \\ -x - 3y/2 + 3/2 & y \\ -y/2 + 3/2 & -x \end{pmatrix}.$$

b. Show that the columns of $A$ form a module basis for $\mathrm{Syz}(g_1, g_2, g_3)$, and
deduce that $I$ has a finite free resolution of length 1:

$$0 \to R^2 \xrightarrow{A} R^3 \to I \to 0.$$

c. Show that the determinants of the $2 \times 2$ minors of $A$ are just the $g_i$ (up
to signs).

Exercise 5. Verify that the resolution from (1.8) of §1 has the form given
in Proposition (2.6). (In this case too, the module being resolved is the
ideal of a finite set of points in $k^2$, each appearing with multiplicity 1.)

Exercise 6. Let

$$I = \langle z^3 - y,\ yz - x,\ y^3 - x^2z,\ xz^2 - y^2 \rangle$$

be the ideal in $k[x, y, z]$ considered in §1; see (1.16).

a. Show that the generators of I are a Grobner basis with respect to the 
grevlex order. 

b. The sres command in Singular produces a resolution using Schreyer’s
algorithm. The Singular session is as follows.

    > ring r=0, (x,y,z), (dp, C);
    > ideal I=(z3-y,yz-x,y3-x2z,xz2-y2);
    > sres(I,0);
    [1]:
       _[1]=yz-x
       _[2]=z3-y
       _[3]=xz2-y2
       _[4]=y3-x2z
    [2]:
       _[1]=-z2*gen(1)+y*gen(2)-1*gen(3)
       _[2]=-xz*gen(1)+y*gen(3)+1*gen(4)
       _[3]=-x*gen(2)+y*gen(1)+z*gen(3)
       _[4]=-y2*gen(1)+x*gen(3)+z*gen(4)
    [3]:
       _[1]=x*gen(1)+y*gen(3)-z*gen(2)+1*gen(4)

Show that the displayed generators are Gröbner bases with respect to
the orderings prescribed by Schreyer’s Theorem from Chapter 5, §3.

c. Explain why using Schreyer’s Theorem produces a longer resolution in
this case than that displayed in §1.

Exercise 7. Find a free resolution of length 1 of the form given in
Proposition (2.6) for the ideal

$$I = \langle x^4 - x^3y,\ x^3y - x^2y^2,\ x^2y^2 - xy^3,\ xy^3 - y^4 \rangle$$

in $R = k[x, y]$. Identify the matrix $A$ and the element $g \in R$ in this case
in Proposition (2.6). Why is $g \ne 1$?

Exercise 8. Let $\mathcal{G}$ be a monic reduced Gröbner basis for a submodule
$M \subset R^t$ with respect to some monomial order. Assume that the leading
terms of all the elements of $\mathcal{G}$ are constant multiples of standard basis
vectors in $R^t$.

a. If $e_i$ is the leading term of some element of $\mathcal{G}$, show that it is the leading
term of exactly one element of $\mathcal{G}$.

b. Show that $\mathrm{Syz}(\mathcal{G}) = \{0\} \subset R^s$.

c. Deduce that $M$ is a free module.

Exercise 9. In this exercise, we will sketch one way to show that every
free resolution of the quotient $R/I$ for

$$I = \langle x^2 - x,\ xy,\ y^2 - y \rangle \subset R = k[x, y]$$

has length $\ge 2$. In other words, the resolution
$0 \to R^2 \to R^3 \to R \to R/I \to 0$ from Exercise 1 is as short as possible.
We will need to use some ideas from Chapter 4 of this book.

a. Let $M$ be an $R$-module, and let $P$ be a maximal ideal in $R$. Generalizing
the construction of the local ring $R_P$, define the localization of $M$ at $P$,
written $M_P$, to be the set of “fractions” $m/f$, where $m \in M$, $f \notin P$,
subject to the relation that $m/f = m'/f'$ whenever there is some $g \in R$,
$g \notin P$ such that $g(f'm - fm') = 0$ in $M$. Show that $M_P$ has the
structure of a module over the local ring $R_P$. If $M$ is a free $R$-module,
show that $M_P$ is a free $R_P$-module.

b. Given a homomorphism $\varphi : M \to N$ of $R$-modules, show that there is
an induced homomorphism of the localized modules $\varphi_P : M_P \to N_P$
defined by $\varphi_P(m/f) = \varphi(m)/f$ for all $m/f \in M_P$. Hint: First show
that this rule gives a well-defined mapping from $M_P$ to $N_P$.

c. Let

$$M_1 \xrightarrow{\varphi_1} M_2 \xrightarrow{\varphi_2} M_3$$

be an exact sequence of $R$-modules. Show that the localized sequence

$$(M_1)_P \xrightarrow{(\varphi_1)_P} (M_2)_P \xrightarrow{(\varphi_2)_P} (M_3)_P$$

is also exact.

d. We want to show that the shortest free resolution of $M = R/I$ for
$I = \langle x^2 - x, xy, y^2 - y \rangle$ has length 2. Aiming for a contradiction, suppose
that there is some resolution of length 1: $0 \to F_1 \to F_0 \to M \to 0$.
Explain why we may assume $F_0 = R$.

e. By part c, after localizing at $P = \langle x, y \rangle \supset I$, we obtain a resolution
$0 \to (F_1)_P \to R_P \to M_P \to 0$. Show that $M_P$ is isomorphic to
$R_P/\langle x, y \rangle R_P \cong k$ as an $R_P$-module.

f. But then the image of $(F_1)_P \to R_P$ must be $\langle x, y \rangle$. Show that we obtain
a contradiction because this is not a free $R_P$-module.

Exercise 10. In $R = k[x_1, \ldots, x_n]$, consider the ideals

$$I_m = \langle x_1, x_2, \ldots, x_m \rangle$$

generated by subsets of the variables, for $1 \le m \le n$.

a. Find explicit resolutions for the ideals $I_2, \ldots, I_5$ in $k[x_1, \ldots, x_5]$.

b. Show in general that $I_m$ has a free resolution of length $m - 1$ of the
form

$$0 \to R^{\binom{m}{m}} \to \cdots \to R^{\binom{m}{3}} \to R^{\binom{m}{2}} \to R^{m} \to I_m \to 0,$$

where if we index the basis $B_k$ of $R^{\binom{m}{k}}$ by $k$-element subsets of
$\{1, \ldots, m\}$:

$$B_k = \{e_{i_1 i_2 \ldots i_k} : 1 \le i_1 < i_2 < \cdots < i_k \le m\},$$

then the mapping $\varphi_k : R^{\binom{m}{k}} \to R^{\binom{m}{k-1}}$ in the resolution is defined by

$$\varphi_k(e_{i_1 i_2 \ldots i_k}) = \sum_{j=1}^{k} (-1)^{j+1} x_{i_j} e_{i_1 \ldots i_{j-1} i_{j+1} \ldots i_k},$$

where in the term with index $j$, $i_j$ is omitted to yield a $(k - 1)$-element
subset. These resolutions are examples of Koszul complexes. See [Eis]
for more information about this topic.
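The combinatorics of these maps can be explored with a short script. The following is a minimal sketch, assuming the sign convention $(-1)^{j+1}$ for the $j$th term (with $j$ counted from 1; the code counts from 0, so it uses $(-1)^j$). It only checks that consecutive maps compose to zero and lists the ranks $\binom{m}{k}$. It does not verify exactness, which is the real content of the exercise. All function names are ours.

```python
from itertools import combinations
from math import comb

def koszul_differential(subset):
    """Terms of phi_k(e_subset) as (sign, variable index, smaller subset)."""
    return [((-1) ** j, subset[j], subset[:j] + subset[j + 1:])
            for j in range(len(subset))]

def compose_twice(subset):
    """Coefficients of phi_{k-1}(phi_k(e_subset)), keyed by
    (remaining basis subset, pair of variable indices in the monomial)."""
    total = {}
    for s1, v1, sub1 in koszul_differential(subset):
        for s2, v2, sub2 in koszul_differential(sub1):
            key = (sub2, tuple(sorted((v1, v2))))
            total[key] = total.get(key, 0) + s1 * s2
    return total

def is_complex(m):
    """True if phi_{k-1} composed with phi_k vanishes on every basis element."""
    return all(c == 0
               for k in range(2, m + 1)
               for subset in combinations(range(1, m + 1), k)
               for c in compose_twice(subset).values())

def ranks(m):
    """Ranks of the free modules R^(m choose k) for k = 1, ..., m."""
    return [comb(m, k) for k in range(1, m + 1)]
```

For instance, `koszul_differential((1, 2))` encodes $\varphi_2(e_{12}) = x_1 e_2 - x_2 e_1$, and `ranks(5)` lists the ranks appearing in the resolution of $I_5$.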

Exercise 11. In this exercise, we will sketch a proof of Proposition (2.6).
The basic idea is to consider the linear mapping from $K^{m-1}$ to $K^m$ defined
by the matrix $A$ in a resolution

$$0 \to R^{m-1} \xrightarrow{A} R^m \xrightarrow{B} I \to 0,$$

where $K = k(x_1, \ldots, x_n)$ is the field of rational functions (the field of
fractions of $R$) and to use some linear algebra over $K$.

a. Let $V$ be the space of solutions of the homogeneous system of linear
equations $XA = 0$ where $X \in K^m$ is written as a row vector. Show
that the dimension over $K$ of $V$ is 1. Hint: The columns $A_1, \ldots, A_{m-1}$
of $A$ are linearly independent over $R$, hence over $K$.

b. Let $B = ( f_1 \ \ldots \ f_m )$ and note that exactness implies that $BA = 0$.
Let $\tilde{f}_i = (-1)^{i+1} \det(A_i)$, where $A_i$ is the $(m - 1) \times (m - 1)$ submatrix
of $A$ obtained by deleting row $i$. Show that $X = (\tilde{f}_1, \ldots, \tilde{f}_m)$ is also an
element of the space $V$ of solutions of $XA = 0$. Hint: append any one of
the columns of $A$ to $A$ to form an $m \times m$ matrix $\tilde{A}$, and expand $\det(\tilde{A})$
by minors along the new column.

c. Deduce that there is some $r \in K$ such that $r\tilde{f}_i = f_i$ for all $i = 1, \ldots, m$.

d. Write $r = g/h$ where $g, h \in R$ and the fraction is in lowest terms, and
consider the equations $g\tilde{f}_i = hf_i$. We want to show that $h$ must be a
nonzero constant, arguing by contradiction. If not, then let $p$ be any
irreducible factor of $h$. Show that $A_1, \ldots, A_{m-1}$ are linearly dependent
modulo $\langle p \rangle$, or in other words that there exist $r_1, \ldots, r_{m-1}$ not all in
$\langle p \rangle$ such that $r_1A_1 + \cdots + r_{m-1}A_{m-1} = pB$ for some $B \in R^m$.

e. Continuing from part d, show that $B \in \mathrm{Syz}(f_1, \ldots, f_m)$ also, so that
$B = s_1A_1 + \cdots + s_{m-1}A_{m-1}$ for some $s_i \in R$.

f. Continuing from part e, show that $(r_1 - ps_1, \ldots, r_{m-1} - ps_{m-1})^T$ would
be a syzygy on the columns of $A$. Since those columns are linearly
independent over $R$, $r_i - ps_i = 0$ for all $i$. Deduce a contradiction to the
way we chose the $r_i$.

g. Finally, in the case that $V(I)$ has dimension $n - 2$, show that $g$ must
be a nonzero constant also. Hence by multiplying each $\tilde{f}_i$ by a nonzero
constant, we could take $g = 1$ in Proposition (2.6).



§3 Graded Resolutions 

In algebraic geometry, free resolutions are often used to study the
homogeneous ideals $I = I(V)$ of projective varieties $V \subset \mathbb{P}^n$ and other modules
over $k[x_0, \ldots, x_n]$. The key fact we will use is that these resolutions have
an extra structure coming from the grading on the ring $R = k[x_0, \ldots, x_n]$,
that is the direct sum decomposition

(3.1)  $R = \bigoplus_{s \ge 0} R_s$

into the additive subgroups (or $k$-vector subspaces) $R_s = k[x_0, \ldots, x_n]_s$,
consisting of the homogeneous polynomials of total degree $s$, together with
0. To begin this section we will introduce some convenient notation and
terminology for describing such resolutions. 
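The decomposition (3.1) simply says that every polynomial is, in a unique way, a sum of homogeneous pieces. As a minimal sketch (the dictionary representation and the function name are our own choices for illustration), a polynomial stored as a map from exponent tuples to coefficients can be split into its homogeneous components as follows.

```python
def homogeneous_components(poly):
    """Split {exponent tuple: coeff} into {total degree: homogeneous part}."""
    parts = {}
    for exps, coeff in poly.items():
        parts.setdefault(sum(exps), {})[exps] = coeff
    return parts

# f = x0^2*x1 + 3*x1^3 - x0*x1 in k[x0, x1]
f = {(2, 1): 1, (0, 3): 3, (1, 1): -1}
parts = homogeneous_components(f)
```

Here `parts[3]` is the component of `f` in $R_3$ and `parts[2]` is the component in $R_2$. An ideal is homogeneous exactly when it contains all such components of each of its elements.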

(3.2) Definition. A graded module over $R$ is a module $M$ with a family
of subgroups $\{M_t : t \in \mathbb{Z}\}$ of the additive group of $M$. The elements of $M_t$
are called the homogeneous elements of degree $t$ in the grading, and the
$M_t$ must satisfy the following properties.

a. As additive groups,

$$M = \bigoplus_{t \in \mathbb{Z}} M_t.$$

b. The decomposition of $M$ in part a is compatible with the multiplication
by elements of $R$ in the sense that $R_sM_t \subset M_{s+t}$ for all $s \ge 0$ and all
$t \in \mathbb{Z}$.

It is easy to see from the definition that each $M_t$ is a module over the
subring $R_0 = k \subset R$, hence a $k$-vector subspace of $M$. If $M$ is finitely
generated, the $M_t$ are finite dimensional over $k$.

Homogeneous ideals $I \subset R$ are the most basic examples of graded
modules. Recall that an ideal is homogeneous if whenever $f \in I$, the
homogeneous components of $f$ are all in $I$ as well (see for instance, [CLO],
Chapter 8, §3, Definition 1). Some of the other important properties of
these ideals are summarized in the following statement.

• (Homogeneous Ideals) Let $I \subset k[x_0, \ldots, x_n]$ be an ideal. Then the
following are equivalent:

a. $I$ is a homogeneous ideal.

b. $I = \langle f_1, \ldots, f_s \rangle$ where the $f_i$ are homogeneous polynomials.

c. A reduced Gröbner basis for $I$ (with respect to any monomial order)
consists of homogeneous polynomials.

(See for instance [CLO], Theorem 2 of Chapter 8, §3.)

To show that a homogeneous ideal $I$ has a graded module structure, set
$I_t = I \cap R_t$. For $t \ge 0$, this is the set of all homogeneous elements of total
degree $t$ in $I$ (together with 0), and $I_t = \{0\}$ for $t < 0$. By the definition
of a homogeneous ideal, we have $I = \bigoplus_{t \in \mathbb{Z}} I_t$, and $R_sI_t \subset I_{s+t}$ is a direct
consequence of the definition of an ideal and the properties of polynomial
multiplication.

The free modules $R^m$ are also graded modules over $R$ provided we take
$(R^m)_t = (R_t)^m$. We will call this the standard graded module structure on
$R^m$. Other examples of graded modules are given by submodules of the free
modules $R^m$ with generating sets possessing suitable homogeneity properties,
and we have statements analogous to those above for homogeneous
ideals.

(3.3) Proposition. Let $M \subset R^m$ be a submodule. Then the following are
equivalent.

a. The standard grading on $R^m$ induces a graded module structure on $M$,
given by taking $M_t = (R_t)^m \cap M$, the set of elements in $M$ where each
component is a homogeneous polynomial of degree $t$ (or 0).

b. $M = \langle f_1, \ldots, f_r \rangle$ in $R^m$ where each $f_i$ is a vector of homogeneous
polynomials of the same degree $d_i$.

c. A reduced Gröbner basis (for any monomial order on $R^m$) consists of
vectors of homogeneous polynomials where all the components of each
vector have the same degree.

Proof. The proof is left to the reader as Exercise 8 below. □



Submodules, direct sums, and quotient modules extend to graded modules
in the following ways. If $M$ is a graded module and $N$ is a submodule
of $M$, then we say $N$ is a graded submodule if the additive subgroups
$N_t = M_t \cap N$ for $t \in \mathbb{Z}$ define a graded module structure on $N$. For example,
Proposition (3.3) says that the submodules $M = \langle f_1, \ldots, f_r \rangle$ in $R^m$
where each $f_i$ is a vector of homogeneous polynomials of the same degree
$d_i$ are graded submodules of $R^m$.



Exercise 1.

a. Given a collection of graded modules $M_1, \ldots, M_m$, we can produce the
direct sum $N = M_1 \oplus \cdots \oplus M_m$ as usual. In $N$, let

$$N_t = (M_1)_t \oplus \cdots \oplus (M_m)_t.$$

Show that the $N_t$ define the structure of a graded module on $N$.

b. If $N \subset M$ is a graded submodule of a graded module $M$, show that
the quotient module $M/N$ also has a graded module structure, defined
by the collection of additive subgroups

$$(M/N)_t = M_t/N_t = M_t/(M_t \cap N).$$



Given any graded $R$-module $M$, we can also produce modules that are
isomorphic to $M$ as abstract $R$-modules, but with different gradings, by
the following trick of shifting the indexing of the family of submodules.

(3.4) Proposition. Let $M$ be a graded $R$-module, and let $d$ be an integer.
Let $M(d)$ be the direct sum

$$M(d) = \bigoplus_{t \in \mathbb{Z}} M(d)_t,$$

where $M(d)_t = M_{d+t}$. Then $M(d)$ is also a graded $R$-module.

Proof. The proof is left to the reader as Exercise 9. □

For instance, the modules $(R^m)(d) = R(d)^m$ are called shifted or twisted
graded free modules over $R$. The standard basis vectors $e_i$ still form a
module basis for $R(d)^m$, but they are now homogeneous elements of degree
$-d$ in the grading, since $R(d)_{-d} = R_0$. More generally, part a of Exercise 1
shows that we can consider graded free modules of the form

$$R(d_1) \oplus \cdots \oplus R(d_m)$$

for any integers $d_1, \ldots, d_m$, where the basis vector $e_i$ is homogeneous of
degree $-d_i$ for each $i$.
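The effect of a shift is easy to tabulate. The following is a minimal sketch, assuming $R$ is a polynomial ring in `num_vars` variables, so that $\dim_k R_s = \binom{s + \text{num\_vars} - 1}{\text{num\_vars} - 1}$ for $s \ge 0$. The function names are ours. It computes $\dim_k R(d)_t = \dim_k R_{d+t}$.

```python
from math import comb

def dim_R(s, num_vars):
    """dim_k of R_s for a polynomial ring in num_vars variables (0 if s < 0)."""
    return comb(s + num_vars - 1, num_vars - 1) if s >= 0 else 0

def dim_shifted(d, t, num_vars):
    """dim_k of R(d)_t, using R(d)_t = R_{d+t}."""
    return dim_R(d + t, num_vars)
```

For example, with four variables, `dim_shifted(-3, 3, 4)` is 1 and `dim_shifted(-3, 2, 4)` is 0: the module $R(-3)$ starts in degree 3, where it is spanned by its basis vector.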

Exercise 2. This exercise will generalize Proposition (3.3). Suppose that
we have integers $d_1, \ldots, d_m$ and elements $f_1, \ldots, f_s \in R^m$ such that

$$f_i = (f_{i1}, \ldots, f_{im})^T$$

where the $f_{ij}$ are homogeneous and $\deg f_{i1} - d_1 = \cdots = \deg f_{im} - d_m$
for each $i$ (recall that the basis vector $e_j$ of $R(d_j)$ has degree $-d_j$, so
$f_{ij}e_j$ has degree $\deg f_{ij} - d_j$). Then prove that $M = \langle f_1, \ldots, f_s \rangle$ is a
graded submodule of $F = R(d_1) \oplus \cdots \oplus R(d_m)$. Also show that every graded
submodule of $F$ has a set of generators of this form.

As the examples given later in the section will show, the twisted free
modules we deal with are typically of the form

$$R(-d_1) \oplus \cdots \oplus R(-d_m).$$

Here, the standard basis elements $e_1, \ldots, e_m$ have respective degrees
$d_1, \ldots, d_m$.

Next we consider how homomorphisms interact with gradings on 
modules. 

(3.5) Definition. Let $M$, $N$ be graded modules over $R$. A homomorphism
$\varphi : M \to N$ is said to be a graded homomorphism of degree $d$ if
$\varphi(M_t) \subset N_{t+d}$ for all $t \in \mathbb{Z}$.

For instance, suppose that $M$ is a graded $R$-module generated by
homogeneous elements $f_1, \ldots, f_m$ of degrees $d_1, \ldots, d_m$. Then we get a graded
homomorphism

$$\varphi : R(-d_1) \oplus \cdots \oplus R(-d_m) \to M$$

which sends the standard basis element $e_i$ to $f_i \in M$. Note that $\varphi$ is onto.
Also, since $e_i$ has degree $d_i$, it follows that $\varphi$ has degree zero.

Exercise 3. Suppose that $M$ is a finitely generated $R$-module. As usual,
$M_t$ denotes the set of homogeneous elements of $M$ of degree $t$.

a. Prove that $M_t$ is a finite dimensional vector space over the field $k$ and
that $M_t = \{0\}$ for $t \ll 0$. Hint: Use the surjective map $\varphi$ constructed
above.

b. Let $\psi : M \to M$ be a graded homomorphism of degree zero. Prove that
$\psi$ is an isomorphism if and only if $\psi : M_t \to M_t$ is onto for every $t$.
Conclude that $\psi$ is an isomorphism if and only if it is onto.



Another example of a graded homomorphism is given by an $m \times p$ matrix
$A$ all of whose entries are homogeneous polynomials of degree $d$ in the
ring $R$. Then $A$ defines a graded homomorphism $\varphi$ of degree $d$ by matrix
multiplication

$$\varphi : R^p \to R^m, \qquad f \mapsto Af.$$

If desired, we can also consider $A$ as defining a graded homomorphism of
degree zero from the shifted module $R(-d)^p$ to $R^m$. Similarly, if the entries
of the $j$th column are all homogeneous polynomials of degree $d_j$, but the
degree varies with the column, then $A$ defines a graded homomorphism of
degree zero

$$R(-d_1) \oplus \cdots \oplus R(-d_p) \to R^m.$$

Still more generally, a graded homomorphism of degree zero

$$R(-d_1) \oplus \cdots \oplus R(-d_p) \to R(-c_1) \oplus \cdots \oplus R(-c_m)$$

is defined by an $m \times p$ matrix $A$ where the $ij$ entry $a_{ij} \in R$ is homogeneous
of degree $d_j - c_i$ for all $i, j$. We will call a matrix $A$ satisfying this condition
for some collection $d_j$ of column degrees and some collection $c_i$ of row
degrees a graded matrix over $R$.

The reason for discussing graded matrices in detail is that these matrices
appear in free resolutions of graded modules over $R$. For example, consider
the resolution of the homogeneous ideal

$$M = \langle z^3 - yw^2,\ yz - xw,\ y^3 - x^2z,\ xz^2 - y^2w \rangle$$

in $R = k[x, y, z, w]$ from (1.15) of this chapter, computed using Macaulay.
The ideal itself is the image of a graded homomorphism of degree zero

$$R(-3) \oplus R(-2) \oplus R(-3)^2 \to R,$$

where the shifts are just the negatives of the degrees of the generators,
ordered as above. The next matrix in the resolution:

$$A = \begin{pmatrix} 0 & -x & 0 & -y \\ xz & yw & y^2 & z^2 \\ -w & 0 & -z & 0 \\ -y & z & -x & w \end{pmatrix}$$

(whose columns generate the module of syzygies on the generators of $M$)
defines a graded homomorphism of degree zero

$$R(-4)^4 \xrightarrow{A} R(-3) \oplus R(-2) \oplus R(-3)^2.$$

In other words, $d_j = 4$ for all $j$, and $c_1 = c_3 = c_4 = 3$, $c_2 = 2$ in the
notation as above, so all entries on rows 1, 3, 4 of $A$ are homogeneous of
degree $4 - 3 = 1$, while those on row 2 have degree $4 - 2 = 2$. The whole
resolution can be written in the form

(3.6)  $0 \to R(-5) \to R(-4)^4 \to R(-3) \oplus R(-2) \oplus R(-3)^2 \to M \to 0,$

where all the arrows are graded homomorphisms of degree zero.
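The degree bookkeeping in this example can be checked mechanically. The following is a minimal sketch, assuming the generators $z^3 - yw^2$, $yz - xw$, $y^3 - x^2z$, $xz^2 - y^2w$ and the $4 \times 4$ syzygy matrix with rows $(0, -x, 0, -y)$, $(xz, yw, y^2, z^2)$, $(-w, 0, -z, 0)$, $(-y, z, -x, w)$ (entered by hand here, so treat them as assumptions of the sketch). The helper names `poly`, `add`, `mul`, `is_graded` are ours. The script tests the graded matrix condition "$a_{ij}$ is homogeneous of degree $d_j - c_i$" and that each column really is a syzygy.

```python
from collections import defaultdict
from functools import reduce

VARS = "xyzw"  # variable order used for exponent tuples

def poly(*terms):
    """Build {exponent tuple: coeff} from (coeff, 'monomial') pairs."""
    p = defaultdict(int)
    for coeff, mono in terms:
        p[tuple(mono.count(v) for v in VARS)] += coeff
    return {e: c for e, c in p.items() if c != 0}

def add(p, q):
    r = defaultdict(int)
    for e, c in list(p.items()) + list(q.items()):
        r[e] += c
    return {e: c for e, c in r.items() if c != 0}

def mul(p, q):
    r = defaultdict(int)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            r[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return {e: c for e, c in r.items() if c != 0}

def is_graded(matrix, col_degs, row_degs):
    """True if every nonzero entry a_ij is homogeneous of degree d_j - c_i."""
    return all({sum(e) for e in entry} <= {col_degs[j] - row_degs[i]}
               for i, row in enumerate(matrix)
               for j, entry in enumerate(row))

gens = [poly((1, "zzz"), (-1, "yww")), poly((1, "yz"), (-1, "xw")),
        poly((1, "yyy"), (-1, "xxz")), poly((1, "xzz"), (-1, "yyw"))]

A = [[poly(), poly((-1, "x")), poly(), poly((-1, "y"))],
     [poly((1, "xz")), poly((1, "yw")), poly((1, "yy")), poly((1, "zz"))],
     [poly((-1, "w")), poly(), poly((-1, "z")), poly()],
     [poly((-1, "y")), poly((1, "z")), poly((-1, "x")), poly((1, "w"))]]

# Column degrees d_j = 4, row degrees c = (3, 2, 3, 3), as in the text.
graded = is_graded(A, col_degs=[4, 4, 4, 4], row_degs=[3, 2, 3, 3])

# Each column of A should be a syzygy: sum_i gens[i] * A[i][j] = 0.
column_relations = [reduce(add, (mul(gens[i], A[i][j]) for i in range(4)), {})
                    for j in range(4)]
```

Changing the row degrees to, say, `[3, 3, 3, 3]` makes `is_graded` fail, because the entries in row 2 have degree 2 rather than 1.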

Here is the precise definition of a graded resolution. 

(3.7) Definition. If $M$ is a graded $R$-module, then a graded resolution of
$M$ is a resolution of the form

$$\cdots \to F_2 \xrightarrow{\varphi_2} F_1 \xrightarrow{\varphi_1} F_0 \xrightarrow{\varphi_0} M \to 0,$$

where each $F_\ell$ is a twisted free graded module $R(-d_1) \oplus \cdots \oplus R(-d_p)$ and
each homomorphism $\varphi_\ell$ is a graded homomorphism of degree zero (so that
the $\varphi_\ell$ are given by graded matrices as defined above).

The resolution given in (3.6) is clearly a graded resolution. What’s nice
is that every finitely generated graded $R$-module has a graded resolution
of finite length.

(3.8) Theorem (Graded Hilbert Syzygy Theorem). Let
$R = k[x_1, \ldots, x_n]$. Then every finitely generated graded $R$-module has a finite
graded resolution of length at most $n$.

Proof. This follows from the proof of Theorem (2.1) (the Syzygy Theorem
in the ungraded case) with minimal changes. The reason is that by
Proposition (3.3) and the generalization given in Exercise 2, if we apply
Schreyer’s theorem to find generators for the module of syzygies on a
homogeneous ordered Gröbner basis $(g_1, \ldots, g_s)$ for a graded submodule of
$R(-d_1) \oplus \cdots \oplus R(-d_p)$, then the syzygies $s_{ij}$ are also homogeneous and
“live” in another graded submodule of the same form. We leave the details
of the proof as Exercise 5 below. □

The res and nres commands in Macaulay will compute a finite graded
resolution using the method outlined in the proof of Theorem (3.8).
However, the resolutions produced by Macaulay are of a very special sort.

(3.9) Definition. Suppose that

$$\cdots \to F_\ell \xrightarrow{\varphi_\ell} F_{\ell-1} \to \cdots \to F_0 \to M \to 0$$

is a graded resolution of $M$. Then the resolution is minimal if for every
$\ell \ge 1$, the nonzero entries of the graded matrix of $\varphi_\ell$ have positive degree.

For an example, the reader should note that the resolution (3.6) is a
minimal resolution. But not all resolutions are minimal, as shown by the
following example.

Exercise 4. Show that the resolution from (1.11) can be homogenized to
give a graded resolution, and explain why it is not minimal. Also show that
the resolution from (1.10) is minimal after we homogenize.

In Macaulay, nres computes a minimal resolution. The resolutions
produced by res, on the other hand, might not be minimal, but they are
minimal after the first step, i.e., they satisfy Definition (3.9) for all $\ell \ge 2$.

We will soon see that minimal resolutions have many nice properties.
But first, let’s explain why they are called “minimal”. We say that a set
of generators of a module is minimal if no proper subset generates the
module. Now suppose that we have a graded resolution

$$\cdots \to F_\ell \xrightarrow{\varphi_\ell} F_{\ell-1} \to \cdots \to F_0 \to M \to 0.$$

Each $\varphi_\ell$ gives a surjective map $F_\ell \to \mathrm{im}(\varphi_\ell)$, so that $\varphi_\ell$ takes the
standard basis of $F_\ell$ to a generating set of $\mathrm{im}(\varphi_\ell)$. Then we can characterize
minimality as follows.

(3.10) Proposition. The above resolution is minimal if and only if for
each $\ell \ge 0$, $\varphi_\ell$ takes the standard basis of $F_\ell$ to a minimal generating set
of $\mathrm{im}(\varphi_\ell)$.

Proof. We will prove one direction and leave the other as an exercise.
Suppose that for some $\ell \ge 1$ the graded matrix $A_\ell$ of $\varphi_\ell$ has entries of
positive degree. We will show that $\varphi_{\ell-1}$ takes the standard basis of $F_{\ell-1}$ to
a minimal generating set of $\mathrm{im}(\varphi_{\ell-1})$. Let $e_1, \ldots, e_m$ be the standard basis
vectors of $F_{\ell-1}$. If $\varphi_{\ell-1}(e_1), \ldots, \varphi_{\ell-1}(e_m)$ is not a minimal generating set,
then some $\varphi_{\ell-1}(e_i)$ can be expressed in terms of the others. Reordering the
basis if necessary, we can assume that

$$\varphi_{\ell-1}(e_1) = \sum_{i=2}^{m} a_i \varphi_{\ell-1}(e_i), \qquad a_i \in R.$$

Then $\varphi_{\ell-1}(e_1 - a_2 e_2 - \cdots - a_m e_m) = 0$, so $(1, -a_2, \ldots, -a_m) \in \ker(\varphi_{\ell-1})$.
By exactness, $(1, -a_2, \ldots, -a_m) \in \mathrm{im}(\varphi_\ell)$. Since $A_\ell$ is the matrix of $\varphi_\ell$,
the columns of $A_\ell$ generate $\mathrm{im}(\varphi_\ell)$. We are assuming that the nonzero
components of these columns have positive degree. Since the first entry of
$(1, -a_2, \ldots, -a_m)$ is a nonzero constant, it follows that this vector cannot
be an $R$-linear combination of the columns of $A_\ell$. This contradiction proves
that the $\varphi_{\ell-1}(e_i)$ give a minimal generating set of $\mathrm{im}(\varphi_{\ell-1})$. □

The above proposition shows that minimal resolutions are very intuitive. 
For example, suppose that we have built a graded resolution of an $R$-module
$M$ out to stage $\ell - 1$:

$$F_{\ell-1} \xrightarrow{\varphi_{\ell-1}} F_{\ell-2} \to \cdots \to F_0 \to M \to 0.$$

We extend one more step by picking a generating set of $\ker(\varphi_{\ell-1})$ and
defining $\varphi_\ell : F_\ell \to \ker(\varphi_{\ell-1}) \subset F_{\ell-1}$ by mapping the standard basis of
$F_\ell$ to the chosen generating set. To be efficient, we should pick a minimal
generating set, and if we do this at every step of the construction, then 
Proposition (3.10) guarantees that we get a minimal resolution. 

Exercise 5. Give a careful proof of Theorem (3.8) (the Graded Syzygy 
Theorem), and then modify the proof to show that every finitely generated 
graded module over $k[x_1, \ldots, x_n]$ has a minimal resolution of length $\le n$.
Hint: Use Proposition (3.10). 

We next discuss to what extent a minimal resolution is unique. The first 
step is to define what it means for two resolutions to be the same. 

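(3.11) Definition. Two graded resolutions $\cdots \to F_0 \xrightarrow{\varphi_0} M \to 0$ and
$\cdots \to G_0 \xrightarrow{\psi_0} M \to 0$ are isomorphic if there are graded isomorphisms
$\alpha_\ell : F_\ell \to G_\ell$ of degree zero such that $\psi_0 \circ \alpha_0 = \varphi_0$ and, for every $\ell \ge 1$,
the diagram

(3.12)
$$\begin{array}{ccc}
F_\ell & \xrightarrow{\ \varphi_\ell\ } & F_{\ell-1} \\
\alpha_\ell \downarrow & & \downarrow \alpha_{\ell-1} \\
G_\ell & \xrightarrow{\ \psi_\ell\ } & G_{\ell-1}
\end{array}$$

commutes, meaning $\alpha_{\ell-1} \circ \varphi_\ell = \psi_\ell \circ \alpha_\ell$.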

We will now show that a finitely generated graded module M has a 
unique minimal resolution up to isomorphism. 

(3.13) Theorem. Any two minimal resolutions of M are isomorphic. 

Proof. We begin by defining $\alpha_0 : F_0 \to G_0$. If $e_1, \ldots, e_m$ is the standard
basis of $F_0$, then we get $\varphi_0(e_i) \in M$, and since $G_0 \to M$ is onto, we can
find $g_i \in G_0$ such that $\psi_0(g_i) = \varphi_0(e_i)$. Then setting $\alpha_0(e_i) = g_i$ defines a
graded homomorphism $\alpha_0 : F_0 \to G_0$ of degree zero, and it follows easily
that $\psi_0 \circ \alpha_0 = \varphi_0$.

A similar argument gives $\beta_0 : G_0 \to F_0$, also a graded homomorphism
of degree zero, such that $\varphi_0 \circ \beta_0 = \psi_0$. Thus $\beta_0 \circ \alpha_0 : F_0 \to F_0$, and if
$1_{F_0} : F_0 \to F_0$ denotes the identity map, then

(3.14) $$\varphi_0 \circ (1_{F_0} - \beta_0 \circ \alpha_0) = \varphi_0 - (\varphi_0 \circ \beta_0) \circ \alpha_0 = \varphi_0 - \psi_0 \circ \alpha_0 = 0.$$

We claim that (3.14) and minimality imply that $\beta_0 \circ \alpha_0$ is an isomorphism.

To see why, first recall from the proof of Proposition (3.10) that the
columns of the matrix representing $\varphi_1$ generate $\mathrm{im}(\varphi_1)$. By minimality,
the nonzero entries in these columns have positive degree. If we let
$\langle x_1, \ldots, x_n \rangle F_0$ denote the submodule of $F_0$ generated by $x_i e_j$ for all $i, j$,
it follows that $\mathrm{im}(\varphi_1) \subset \langle x_1, \ldots, x_n \rangle F_0$.

However, (3.14) implies that $\mathrm{im}(1_{F_0} - \beta_0 \circ \alpha_0) \subset \ker(\varphi_0) = \mathrm{im}(\varphi_1)$. By
the previous paragraph, we see that $v - \beta_0 \circ \alpha_0(v) \in \langle x_1, \ldots, x_n \rangle F_0$ for
all $v \in F_0$. In Exercise 11 at the end of the section, you will show that this
implies that $\beta_0 \circ \alpha_0$ is an isomorphism. In particular, $\alpha_0$ is one-to-one.

By a similar argument using the minimality of the graded resolution
$\cdots \to G_0 \to M \to 0$, $\alpha_0 \circ \beta_0$ is also an isomorphism, which implies
that $\alpha_0$ is onto. Hence $\alpha_0$ is an isomorphism as claimed. Then Exercise
12 at the end of the section will show that $\alpha_0$ induces an isomorphism
$\bar{\alpha}_0 : \ker(\varphi_0) \to \ker(\psi_0)$.

Now we can define $\alpha_1$. Since $\varphi_1 : F_1 \to \mathrm{im}(\varphi_1) = \ker(\varphi_0)$ is onto, we
get a minimal resolution

$$\cdots \to F_1 \xrightarrow{\varphi_1} \ker(\varphi_0) \to 0$$

of $\ker(\varphi_0)$ (see Exercise 7 of §1), and similarly

$$\cdots \to G_1 \xrightarrow{\psi_1} \ker(\psi_0) \to 0$$

is a minimal resolution of $\ker(\psi_0)$. Then, using the isomorphism $\bar{\alpha}_0 :
\ker(\varphi_0) \to \ker(\psi_0)$ just constructed, the above argument easily adapts
to give a graded isomorphism $\alpha_1 : F_1 \to G_1$ of degree zero such that
$\bar{\alpha}_0 \circ \varphi_1 = \psi_1 \circ \alpha_1$. Since $\bar{\alpha}_0$ is the restriction of $\alpha_0$ to $\mathrm{im}(\varphi_1)$, it follows
easily that (3.12) commutes (with $\ell = 1$).

If we apply Exercise 12 again, we see that $\alpha_1$ induces an isomorphism
$\bar{\alpha}_1 : \ker(\varphi_1) \to \ker(\psi_1)$. Repeating the above process, we can now define
$\alpha_2$ with the required properties, and continuing for all $\ell$, the theorem now
follows easily. □

Since we know by Exercise 5 that a finitely generated $R$-module $M$ has a
finite minimal resolution, it follows from Theorem (3.13) that all minimal
resolutions of M are finite. This fact plays a crucial role in the following 
refinement of the Graded Syzygy Theorem. 
(3.15) Theorem. If

$$\cdots \to F_\ell \xrightarrow{\varphi_\ell} F_{\ell-1} \to \cdots \to F_0 \to M \to 0$$

is any graded resolution of $M$ over $k[x_1, \ldots, x_n]$, then the kernel $\ker(\varphi_{n-1})$
is free, and

$$0 \to \ker(\varphi_{n-1}) \to F_{n-1} \to \cdots \to F_0 \to M \to 0$$

is a graded resolution of $M$.

Proof. We begin by showing how to simplify a given graded resolution
$\cdots \to F_0 \to M \to 0$. Suppose that for some $\ell \ge 1$, $\varphi_\ell : F_\ell \to F_{\ell-1}$ is
not minimal, i.e., the matrix $A_\ell$ of $\varphi_\ell$ has a nonzero entry of degree zero.
If we order the standard bases $\{e_1, \ldots, e_m\}$ of $F_\ell$ and $\{u_1, \ldots, u_t\}$ of $F_{\ell-1}$
appropriately, we can assume that

(3.16) $$\varphi_\ell(e_1) = c_1 u_1 + c_2 u_2 + \cdots + c_t u_t$$

where $c_1$ is a nonzero constant (note that $(c_1, \ldots, c_t)^T$ is the first column
of $A_\ell$). Then let $G_\ell \subset F_\ell$ and $G_{\ell-1} \subset F_{\ell-1}$ be the submodules generated
by $\{e_2, \ldots, e_m\}$ and $\{u_2, \ldots, u_t\}$ respectively, and define the maps

$$F_{\ell+1} \xrightarrow{\psi_{\ell+1}} G_\ell \xrightarrow{\psi_\ell} G_{\ell-1} \xrightarrow{\psi_{\ell-1}} F_{\ell-2}$$

as follows:

• $\psi_{\ell+1}$ is the projection $F_\ell \to G_\ell$ (which sends $a_1 e_1 + a_2 e_2 + \cdots + a_m e_m$
to $a_2 e_2 + \cdots + a_m e_m$) composed with $\varphi_{\ell+1}$.

• If the first row of $A_\ell$ is $(c_1, d_2, \ldots, d_m)$, then $\psi_\ell$ is defined by $\psi_\ell(e_i) =
\varphi_\ell(e_i - \frac{d_i}{c_1} e_1)$ for $i = 2, \ldots, m$. Since $\varphi_\ell(e_i) = d_i u_1 + \cdots$ for $i \ge 2$, it
follows easily from (3.16) that $\psi_\ell(e_i) \in G_{\ell-1}$.

• $\psi_{\ell-1}$ is the restriction of $\varphi_{\ell-1}$ to the submodule $G_{\ell-1} \subset F_{\ell-1}$.

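We claim that

$$\cdots \to F_{\ell+2} \xrightarrow{\varphi_{\ell+2}} F_{\ell+1} \xrightarrow{\psi_{\ell+1}} G_\ell \xrightarrow{\psi_\ell} G_{\ell-1} \xrightarrow{\psi_{\ell-1}} F_{\ell-2} \xrightarrow{\varphi_{\ell-2}} F_{\ell-3} \to \cdots$$

is still a resolution of $M$. To prove this, we need to check exactness at
$F_{\ell+1}$, $G_\ell$, $G_{\ell-1}$ and $F_{\ell-2}$. (If we set $M = F_{-1}$ and $F_k = 0$ for $k < -1$, then the
above sequence makes sense for all $\ell \ge 1$.)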

We begin with F^_ 2 . Here, note that applying (f£-i to (3.16) gives 

0 = Ciip£-i{ui) + C2(p£-l{u2) + • • • + C2^£-l{Um)- 

Since c\ is a nonzero constant, (p£~i{ui) is a ii-linear combination of 
(f£-i{ui) for i = 2, ...,m, and then im(<^^_i) = im('0^_i) follows from 
the definition of 'ip£~i. The desired exactness im(t/;^_i) = ker((^£_ 2 ) is now 
an easy consequence of the exactness of the original resolution. 

Next consider $G_{\ell-1}$. First note that for $i \ge 2$, $\psi_{\ell-1} \circ \psi_\ell(e_i) = \psi_{\ell-1} \circ
\varphi_\ell(e_i - \frac{d_i}{c_1} e_1) = 0$ since $\psi_{\ell-1}$ is just the restriction of $\varphi_{\ell-1}$. This shows
that $\mathrm{im}(\psi_\ell) \subset \ker(\psi_{\ell-1})$. To prove the opposite inclusion, suppose that
$\psi_{\ell-1}(v) = 0$ for some $v \in G_{\ell-1}$. Since $\psi_{\ell-1}$ is the restriction of $\varphi_{\ell-1}$,
exactness of the original resolution implies that $v = \varphi_\ell(a_1 e_1 + \cdots + a_m e_m)$.
However, since $u_1$ does not appear in $v \in G_{\ell-1}$ and $\varphi_\ell(e_i) = d_i u_1 + \cdots$,
one easily obtains

(3.17) $$a_1 c_1 + a_2 d_2 + \cdots + a_m d_m = 0$$

by looking at the coefficients of $u_1$. Then

$$\begin{aligned}
\psi_\ell(a_2 e_2 + \cdots + a_m e_m) &= a_2 \psi_\ell(e_2) + \cdots + a_m \psi_\ell(e_m) \\
&= a_2 \varphi_\ell\big(e_2 - \tfrac{d_2}{c_1} e_1\big) + \cdots + a_m \varphi_\ell\big(e_m - \tfrac{d_m}{c_1} e_1\big) \\
&= \varphi_\ell(a_1 e_1 + a_2 e_2 + \cdots + a_m e_m) = v,
\end{aligned}$$

where the last equality follows by (3.17). This completes the proof of
exactness at $G_{\ell-1}$.

The remaining proofs of exactness are straightforward and will be covered 
in Exercise 13 at the end of the section. 

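Since the theorem we're trying to prove is concerned with $\ker(\varphi_{n-1})$, we
need to understand how the kernels of the various maps change under the
above simplification process. If $e_1 \in F_\ell$ has degree $d$, then we claim that:

(3.18)
$$\begin{aligned}
\ker(\varphi_{\ell-1}) &\cong R(-d) \oplus \ker(\psi_{\ell-1}) \\
\ker(\varphi_\ell) &\cong \ker(\psi_\ell) \\
\ker(\varphi_{\ell+1}) &= \ker(\psi_{\ell+1}).
\end{aligned}$$

We will prove the first and leave the others for the reader (see Exercise
13). Since $\psi_{\ell-1}$ is the restriction of $\varphi_{\ell-1}$, we certainly have $\ker(\psi_{\ell-1}) \subset
\ker(\varphi_{\ell-1})$. Also, $\varphi_\ell(e_1) \in \ker(\varphi_{\ell-1})$ gives the submodule $R\varphi_\ell(e_1) \subset
\ker(\varphi_{\ell-1})$, and the map sending $\varphi_\ell(e_1) \mapsto 1$ induces an isomorphism
$R\varphi_\ell(e_1) \cong R(-d)$. To prove that we have a direct sum, note that (3.16)
implies $R\varphi_\ell(e_1) \cap G_{\ell-1} = \{0\}$ since $G_{\ell-1}$ is generated by $u_2, \ldots, u_t$ and $c_1$
is a nonzero constant. From this, we conclude $R\varphi_\ell(e_1) \cap \ker(\psi_{\ell-1}) = \{0\}$,
which implies

$$R\varphi_\ell(e_1) + \ker(\psi_{\ell-1}) = R\varphi_\ell(e_1) \oplus \ker(\psi_{\ell-1}).$$

To show that this equals all of $\ker(\varphi_{\ell-1})$, let $w \in \ker(\varphi_{\ell-1})$ be arbitrary. If
$w = a_1 u_1 + \cdots + a_t u_t$, then set $\tilde{w} = w - \frac{a_1}{c_1} \varphi_\ell(e_1)$. By (3.16), we have
$\tilde{w} \in G_{\ell-1}$, and then $\tilde{w} \in \ker(\psi_{\ell-1})$ follows easily. Thus
$w = \frac{a_1}{c_1} \varphi_\ell(e_1) + \tilde{w} \in R\varphi_\ell(e_1) \oplus \ker(\psi_{\ell-1})$,
which gives the desired direct sum decomposition.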

Hence, we have proved that whenever we have a (pi with a nonzero matrix 
entry of degree zero, we create a resolution with smaller matrices whose 
kernels satisfy (3.18). It follows that if the theorem holds for the smaller 
resolution, then it automatically holds for the original resolution. 

Now the theorem is easy to prove. By repeatedly applying the above process
whenever we find a nonzero matrix entry of degree zero in some $\varphi_\ell$, we
can reduce to a minimal resolution. But minimal resolutions are isomorphic
by Theorem (3.13), and hence, by Exercise 5, the minimal resolution we get
has length $\le n$. Then Proposition (1.12) shows that $\ker(\varphi_{n-1})$ is free for
the minimal resolution, which, as observed above, implies that $\ker(\varphi_{n-1})$
is free for the original resolution as well.

The final assertion of the theorem, that

$$0 \to \ker(\varphi_{n-1}) \to F_{n-1} \to \cdots \to F_0 \to M \to 0$$

is a free resolution, now follows immediately from Proposition (1.12). □

The simplification process used in the proof of Theorem (3.15) can be 
used to show that, in a suitable sense, every graded resolution of M is the 
direct sum of a minimal resolution and a trivial resolution. This gives a 
structure theorem which describes all graded resolutions of a given finitely 
generated module over $k[x_1, \ldots, x_n]$. Details can be found in Theorem 20.2
of [Eis]. 

Exercise 6. Show that the simplification process from the proof of Theo- 
rem (3.15) transforms the homogenization of (1.11) into the homogenization 
of (1.10) (see Exercise 4). 

There is also a version of the theorem just proved which applies to partial 
resolutions. 

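(3.19) Corollary. If

$$F_{n-1} \xrightarrow{\varphi_{n-1}} F_{n-2} \to \cdots \to F_0 \to M \to 0$$

is a partial graded resolution over $k[x_1, \ldots, x_n]$, then $\ker(\varphi_{n-1})$ is free, and

$$0 \to \ker(\varphi_{n-1}) \to F_{n-1} \to \cdots \to F_0 \to M \to 0$$

is a graded resolution of $M$.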

Proof. Since any partial resolution can be extended to a resolution, this 
follows immediately from Theorem (3.15). □ 

One way to think about Corollary (3.19) is that over $k[x_1, \ldots, x_n]$, the
process of taking repeated syzygies leads to a free syzygy module after at
most $n - 1$ steps. This is essentially how Hilbert stated the Syzygy Theorem
in his classic paper [Hil], and sometimes Theorem (3.15) or Corollary (3.19)
are called the Syzygy Theorem. Modern treatments, however, focus on
the existence of a resolution of length $\le n$, since Hilbert's version follows
from existence (our Theorem (3.8)) together with the properties of minimal
resolutions.

As an application of these results, let's study the syzygies of a
homogeneous ideal in two variables.

(3.20) Proposition. Suppose that $f_1, \ldots, f_s \in k[x, y]$ are homogeneous
polynomials. Then the syzygy module $\mathrm{Syz}(f_1, \ldots, f_s)$ is a twisted free
module over $k[x, y]$.

Proof. Let $I = \langle f_1, \ldots, f_s \rangle \subset k[x, y]$. Then we get an exact sequence

$$0 \to I \to R \to R/I \to 0$$

by Proposition (1.2). Also, the definition of the syzygy module gives an
exact sequence

$$0 \to \mathrm{Syz}(f_1, \ldots, f_s) \to R(-d_1) \oplus \cdots \oplus R(-d_s) \to I \to 0$$

where $d_i = \deg f_i$. Splicing these two sequences together as in Exercise 7
of §1, we get the exact sequence

$$0 \to \mathrm{Syz}(f_1, \ldots, f_s) \to R(-d_1) \oplus \cdots \oplus R(-d_s) \xrightarrow{\varphi_1} R \to R/I \to 0.$$

Since $n = 2$, Corollary (3.19) implies that $\ker(\varphi_1) = \mathrm{Syz}(f_1, \ldots, f_s)$ is
free, and the proposition is proved. □

In §4, we will use the Hilbert polynomial to describe the degrees of the
generators of $\mathrm{Syz}(f_1, \ldots, f_s)$ in the special case when all of the $f_i$ have the
same degree.



Additional Exercises for §3 

Exercise 7. Assume that $f_1, \ldots, f_s \in k[x, y]$ are homogeneous and not
all zero. We know that $\mathrm{Syz}(f_1, \ldots, f_s)$ is free by Proposition (3.20), so
that if we ignore gradings, $\mathrm{Syz}(f_1, \ldots, f_s) \cong R^m$ for some $m$. This gives
an exact sequence

$$0 \to R^m \to R^s \to \langle f_1, \ldots, f_s \rangle \to 0.$$

Prove that $m = s - 1$ and conclude that we are in the situation of
the Hilbert-Burch Theorem from §2. Hint: As in Exercise 11 of §2, let
$K = k(x_1, \ldots, x_n)$ be the field of rational functions coming from $R =
k[x_1, \ldots, x_n]$. Explain why the above sequence gives a sequence

$$0 \to K^m \to K^s \to K \to 0$$

and show that this new sequence is also exact. The result will then follow
from the dimension theorem of linear algebra (see Exercise 8 of §1). The
ideas used in Exercise 11 of §2 may be useful.

Exercise 8. Prove Proposition (3.3). Hint: Show a ⇒ c ⇒ b ⇒ a.

Exercise 9. Prove Proposition (3.4). 

Exercise 10. Complete the proof of Proposition (3.10). 

Exercise 11. Suppose that $M$ is a module over $k[x_1, \ldots, x_n]$ generated
by $f_1, \ldots, f_m$. As in the proof of Theorem (3.13), let $\langle x_1, \ldots, x_n \rangle M$ be the
submodule generated by $x_i f_j$ for all $i, j$. Also assume that $\psi : M \to M$ is a
graded homomorphism of degree zero such that $v - \psi(v) \in \langle x_1, \ldots, x_n \rangle M$
for all $v \in M$. Then prove that $\psi$ is an isomorphism. Hint: By part b of
Exercise 3, it suffices to show that $\psi : M_t \to M_t$ is onto. Prove this by
induction on $t$, using part a of Exercise 3 to start the induction.

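Exercise 12. Suppose that we have a diagram of $R$-modules and
homomorphisms

$$\begin{array}{ccc}
A & \xrightarrow{\ \varphi\ } & B \\
\alpha \downarrow & & \downarrow \beta \\
C & \xrightarrow{\ \psi\ } & D
\end{array}$$

which commutes in the sense of Definition (3.11). If in addition $\varphi, \psi$ are
onto and $\alpha, \beta$ are isomorphisms, then prove that $\alpha$ restricted to $\ker(\varphi)$
induces an isomorphism $\bar{\alpha} : \ker(\varphi) \to \ker(\psi)$.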

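Exercise 13. This exercise is concerned with the proof of Theorem (3.15).
We will use the same notation as in that proof, including the sequence of
mappings

$$\cdots \to F_{\ell+1} \xrightarrow{\psi_{\ell+1}} G_\ell \xrightarrow{\psi_\ell} G_{\ell-1} \xrightarrow{\psi_{\ell-1}} F_{\ell-2} \to \cdots.$$

a. Prove that $\varphi_\ell\big(\sum_{i=1}^{m} a_i e_i\big) \in G_{\ell-1}$ if and only if
$a_1 c_1 + \sum_{i=2}^{m} a_i d_i = 0$.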

b. Use part a to prove that the above sequence is exact at $G_\ell$.

c. Prove that the above sequence is exact at $F_{\ell+1}$. Hint: Do you see why
it suffices to show that $\ker(\varphi_{\ell+1}) = \ker(\psi_{\ell+1})$?

d. Prove the second line of (3.18), i.e., that $\ker(\varphi_\ell) \cong \ker(\psi_\ell)$. Hint: Use
part a.

e. Prove the third line of (3.18), i.e., that $\ker(\varphi_{\ell+1}) = \ker(\psi_{\ell+1})$. Hint:
You did this in part c!

Exercise 14. In the proof of Theorem (3.15), we constructed a certain
homomorphism $\psi_\ell : G_\ell \to G_{\ell-1}$. Suppose that $A_\ell$ is the matrix of $\varphi_\ell :
F_\ell \to F_{\ell-1}$ with respect to the bases $e_1, \ldots, e_m$ of $F_\ell$ and $u_1, \ldots, u_t$ of
$F_{\ell-1}$. Write $A_\ell$ in the form

$$A_\ell = \begin{pmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \end{pmatrix},$$

where $A_{00} = c_1$ and $A_{10} = (c_2, \ldots, c_t)^T$ as in (3.16), and $A_{01} =
(d_2, \ldots, d_m)$, where the $d_i$ are from the definition of $\psi_\ell$. If we let $B_\ell$ be
the matrix of $\psi_\ell$ with respect to the bases $e_2, \ldots, e_m$ of $G_\ell$ and $u_2, \ldots, u_t$
of $G_{\ell-1}$, then prove that

$$B_\ell = A_{11} - A_{10} A_{00}^{-1} A_{01}.$$

What's remarkable is that this formula is identical to equation (6.5) in
Chapter 3. As happens often in mathematics, the same idea can appear in
very different contexts.
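
As a small numerical illustration of the formula in Exercise 14, the following Python sketch treats the special case where every entry of the matrix is a rational constant. This is an assumption made only to keep the code short; in the text the entries are homogeneous polynomials. The function names are hypothetical. With $c_1$ the top-left entry, $(c_2, \ldots, c_t)^T$ the rest of the first column and $(d_2, \ldots, d_m)$ the rest of the first row, the sketch compares the block formula with the column operations $e_i \mapsto e_i - (d_i/c_1) e_1$ used to define $\psi_\ell$.

```python
from fractions import Fraction


def block_formula(A):
    """Return A11 - A10 * A00^{-1} * A01 for a matrix A given as a list of rows.

    A00 is the top-left entry c_1, A10 the rest of the first column,
    A01 the rest of the first row, A11 the remaining block.
    """
    c1 = Fraction(A[0][0])
    if c1 == 0:
        raise ValueError("the top-left entry must be a nonzero constant")
    return [[Fraction(A[r][c]) - Fraction(A[r][0]) * Fraction(A[0][c]) / c1
             for c in range(1, len(A[0]))]
            for r in range(1, len(A))]


def column_reduction(A):
    """Replace column i by (column i) - (d_i / c_1) * (column 1) for i >= 2,
    check that the first row becomes (c_1, 0, ..., 0), then drop the first
    row and the first column."""
    c1 = Fraction(A[0][0])
    if c1 == 0:
        raise ValueError("the top-left entry must be a nonzero constant")
    reduced = []
    for row in A:
        new_row = [Fraction(row[0])]
        for c in range(1, len(row)):
            d = Fraction(A[0][c])  # entry d_i of the first row
            new_row.append(Fraction(row[c]) - d / c1 * Fraction(row[0]))
        reduced.append(new_row)
    assert all(entry == 0 for entry in reduced[0][1:])
    return [row[1:] for row in reduced[1:]]


# Hypothetical 3 x 3 example with constant entries.
A = [[2, 4, 6],
     [1, 3, 5],
     [0, 1, 7]]
B = block_formula(A)
```

For this example both functions return the same $2 \times 2$ matrix, which is what Exercise 14 asserts in general.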

Exercise 15. In $k[x_0, \ldots, x_n]$, $n \ge 2$, consider the homogeneous ideal $I_n$
defined by the determinants of the $\binom{n}{2}$ $2 \times 2$ submatrices of the $2 \times n$ matrix

$$M = \begin{pmatrix} x_0 & x_1 & \cdots & x_{n-1} \\ x_1 & x_2 & \cdots & x_n \end{pmatrix}.$$

For instance, $I_2 = \langle x_0 x_2 - x_1^2 \rangle$ is the ideal of a conic section in $\mathbb{P}^2$. We have
already seen $I_3$ in different notation (where?).

a. Show that $I_n$ is the ideal of the rational normal curve of degree $n$ in
$\mathbb{P}^n$, that is, the image of the mapping given in homogeneous coordinates by

$$\varphi : \mathbb{P}^1 \to \mathbb{P}^n, \qquad (s, t) \mapsto (s^n, s^{n-1}t, \ldots, st^{n-1}, t^n).$$

b. Do explicit calculations to find the graded resolutions of the ideals $I_4$, $I_5$.

c. Show that the first syzygy module of the generators for $I_n$ is generated
by the three-term syzygies obtained by appending a copy of the first (resp.
second) row of $M$ to $M$, to make a $3 \times n$ matrix $M'$ (resp. $M''$), then
expanding the determinants of all $3 \times 3$ submatrices of $M'$ (resp. $M''$)
along the new row.

d. Conjecture the general form of a graded resolution of $I_n$. (Proving this
conjecture requires advanced techniques like the Eagon-Northcott complex.
This and other interesting topics are discussed in Appendix A2.6
of [Eis].)
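
The easy half of part a of Exercise 15 (that the generators of $I_n$ vanish on the rational normal curve) can be checked numerically. The Python sketch below is only an illustration with integer sample values, not a proof; the function names are hypothetical. It evaluates the $2 \times 2$ minors $x_i x_{j+1} - x_j x_{i+1}$ of the matrix $M$ at a point $(s^n, s^{n-1}t, \ldots, t^n)$.

```python
from itertools import combinations
from math import comb


def rational_normal_point(n, s, t):
    """Homogeneous coordinates (s^n, s^(n-1) t, ..., t^n) of a curve point."""
    return [s ** (n - i) * t ** i for i in range(n + 1)]


def two_by_two_minors(x):
    """All 2 x 2 minors of the 2 x n matrix with rows
    (x_0, ..., x_{n-1}) and (x_1, ..., x_n)."""
    n = len(x) - 1
    top, bottom = x[:n], x[1:]
    return [top[i] * bottom[j] - top[j] * bottom[i]
            for i, j in combinations(range(n), 2)]


# Sample check for n = 4 at (s, t) = (2, 3).
minors_on_curve = two_by_two_minors(rational_normal_point(4, 2, 3))

# A point chosen off the curve, for contrast.
minors_off_curve = two_by_two_minors([1, 2, 3, 4, 5])
```

On the curve every minor evaluates to zero, while at the contrasting point the first minor is $1 \cdot 3 - 2 \cdot 2 = -1$.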



§4 Hilbert Polynomials and Geometric Applications 

In this section, we will study Hilbert functions and Hilbert polynomials. 
These are computed using the graded resolutions introduced in §3 and con- 
tain some interesting geometric information. We will then give applications 
to the ideal of three points in $\mathbb{P}^2$, parametric equations in the plane, and
invariants of finite group actions. 

Hilbert Functions and Hilbert Polynomials 

We begin by defining the Hilbert function of a graded module. Because we
will be dealing with projective space $\mathbb{P}^n$, it is convenient to work over the
polynomial ring $R = k[x_0, \ldots, x_n]$ in $n + 1$ variables.

If $M$ is a finitely generated graded $R$-module, recall from Exercise 3 of §3
that for each $t$, the degree $t$ homogeneous part $M_t$ is a finite dimensional
vector space over $k$. This leads naturally to the definition of the Hilbert
function.

(4.1) Definition. If $M$ is a finitely generated graded module over $R =
k[x_0, \ldots, x_n]$, then the Hilbert function $H_M(t)$ is defined by

$$H_M(t) = \dim_k M_t,$$

where as usual, $\dim_k$ means dimension as a vector space over $k$.

The most basic example of a graded module is $R = k[x_0, \ldots, x_n]$ itself.
Since $R_t$ is the vector space of homogeneous polynomials of degree $t$ in
$n + 1$ variables, Exercise 19 of Chapter 3, §4 implies that for $t \ge 0$, we
have

$$H_R(t) = \dim_k R_t = \binom{t+n}{n}.$$

If we adopt the convention that $\binom{a}{b} = 0$ if $a < b$, then the above formula
holds for all $t$. Similarly, the reader should check that the Hilbert function
of the twisted module $R(d)$ is given by

(4.2) $$H_{R(d)}(t) = \binom{t+d+n}{n}, \qquad t \in \mathbb{Z}.$$

An important observation is that for $t \ge 0$ and $n$ fixed, the binomial
coefficient $\binom{t+n}{n}$ is a polynomial of degree $n$ in $t$. This is because

(4.3) $$\binom{t+n}{n} = \frac{(t+n)!}{t!\,n!} = \frac{(t+n)(t+n-1)\cdots(t+1)}{n!}.$$

It follows that $H_R(t)$ is given by a polynomial for $t$ sufficiently large ($t \ge 0$
in this case). This will be important below when we define the Hilbert
polynomial.
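
The binomial formula for the Hilbert function of $R$ can be checked by brute force. The Python sketch below (function names are hypothetical) counts the monomials of degree $t$ in $n + 1$ variables by listing them as multisets of variables, and compares the count with $\binom{t+n}{n}$, using the convention that the binomial coefficient is zero for negative $t$.

```python
from itertools import combinations_with_replacement
from math import comb


def count_monomials(num_vars, degree):
    """Count monomials of the given degree in num_vars variables by listing
    each monomial as a multiset of variable indices."""
    if degree < 0:
        return 0
    return sum(1 for _ in combinations_with_replacement(range(num_vars), degree))


def hilbert_function_of_ring(n, t):
    """Binomial formula for dim_k R_t, where R = k[x_0, ..., x_n]."""
    return comb(t + n, n) if t >= 0 else 0


# Compare the two counts for a few small cases.
agreement = all(
    count_monomials(n + 1, t) == hilbert_function_of_ring(n, t)
    for n in range(1, 5)
    for t in range(-2, 7)
)
```

For example, with $n = 3$ and $t = 2$ both functions give 10, the number of quadratic monomials in four variables.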

Here are some exercises which give some simple properties of Hilbert 
functions. 

Exercise 1. If $M$ is a finitely generated graded $R$-module and $M(d)$ is
the twist defined in Proposition (3.4), then show that

$$H_{M(d)}(t) = H_M(t + d)$$

for all $t$. Note how this generalizes (4.2).

Exercise 2. Suppose that $M$, $N$ and $P$ are finitely generated graded
$R$-modules.

a. The direct sum $M \oplus N$ was discussed in Exercise 1 of §3. Prove that
$H_{M \oplus N} = H_M + H_N$.

b. More generally, if we have an exact sequence

$$0 \to M \xrightarrow{\alpha} P \xrightarrow{\beta} N \to 0$$

where $\alpha$ and $\beta$ are graded homomorphisms of degree zero, then show
that $H_P = H_M + H_N$.

c. Explain how part b generalizes part a. Hint: What exact sequence do
we get from $M \oplus N$?

It follows from these exercises that we can compute the Hilbert func- 
tion of any twisted free module. However, for more complicated modules, 
computing the Hilbert function can be rather nontrivial. There are several 
ways to study this problem. For example, if $I \subset R = k[x_0, \ldots, x_n]$ is a
homogeneous ideal, then the quotient ring $R/I$ is a graded $R$-module, and
in Chapter 9, §3 of [CLO], it is shown that if $\langle \mathrm{LT}(I) \rangle$ is the ideal of initial
terms for a monomial order on $R$, then the Hilbert functions $H_{R/I}$ and
$H_{R/\langle \mathrm{LT}(I) \rangle}$ are equal. Using the techniques of Chapter 9, §2 of [CLO], it is
relatively easy to compute the Hilbert function of a monomial ideal. Thus,
once we compute a Gröbner basis of $I$, we can find the Hilbert function of
$R/I$. (Note: The Hilbert function $H_{R/I}$ is denoted $HF_I$ in [CLO].)

A second way to compute Hilbert functions is by means of graded 
resolutions. Here is the basic result. 

(4.4) Theorem. Let $R = k[x_0, \ldots, x_n]$ and let $M$ be a graded $R$-module.
Then, for any graded resolution of $M$

$$0 \to F_k \to F_{k-1} \to \cdots \to F_0 \to M \to 0,$$

we have

$$H_M(t) = \dim_k M_t = \sum_{j=0}^{k} (-1)^j \dim_k (F_j)_t = \sum_{j=0}^{k} (-1)^j H_{F_j}(t).$$

Proof. In a graded resolution, all the homomorphisms are homogeneous
of degree zero, hence for each $t$, restricting all the homomorphisms to the
degree $t$ homogeneous parts of the graded modules, we also have an exact
sequence of finite dimensional $k$-vector spaces

$$0 \to (F_k)_t \to (F_{k-1})_t \to \cdots \to (F_0)_t \to M_t \to 0.$$

The alternating sum of the dimensions in such an exact sequence is 0, by
Exercise 8 of §1. Hence

$$\dim_k M_t = \sum_{j=0}^{k} (-1)^j \dim_k (F_j)_t,$$

and the theorem follows by the definition of Hilbert function. □

Since we know the Hilbert function of any twisted free module (by (4.2) 
and Exercise 2), it follows that the Hilbert function of a graded module 
M can be calculated easily from a graded resolution. For example, let’s 
compute the Hilbert function of the homogeneous ideal $I$ of the twisted
cubic in $\mathbb{P}^3$, namely

(4.5) $$I = \langle xz - y^2,\ xw - yz,\ yw - z^2 \rangle \subset R = k[x, y, z, w].$$

In Exercise 2 of §2 of this chapter, we found that $I$ has a graded resolution
of the form

$$0 \to R(-3)^2 \to R(-2)^3 \to I \to 0.$$

As in the proof of Theorem (4.4), this resolution implies

$$\dim_k I_t = \dim_k R(-2)^3_t - \dim_k R(-3)^2_t$$

for all $t$. Applying Exercise 2 and (4.2), this can be rewritten as

$$H_I(t) = 3\binom{t-2+3}{3} - 2\binom{t-3+3}{3} = 3\binom{t+1}{3} - 2\binom{t}{3}.$$

Using the exact sequence $0 \to I \to R \to R/I \to 0$, Exercise 2 implies that

$$H_{R/I}(t) = H_R(t) - H_I(t) = \binom{t+3}{3} - 3\binom{t+1}{3} + 2\binom{t}{3}$$

for all $t$. For $t = 0, 1, 2$, one (or both) of the binomial coefficients from $H_I$
is zero. However, computing $H_{R/I}(t)$ separately for $t \le 2$ and doing some
algebra, one can show that

(4.6) $$H_{R/I}(t) = 3t + 1$$

for all $t \ge 0$.
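
The alternating-sum formula of Theorem (4.4) is easy to evaluate mechanically. In the Python sketch below (an illustration with hypothetical function names), a twisted free module $R(-d_1) \oplus \cdots \oplus R(-d_m)$ is encoded by its list of shifts $[d_1, \ldots, d_m]$, and a resolution by the list of these lists for $F_0, F_1, \ldots$. The twisted cubic is entered through the resolution of $R/I$ obtained by splicing the resolution of $I$ above with $0 \to I \to R \to R/I \to 0$.

```python
from math import comb


def binom(a, b):
    """Binomial coefficient with the convention that it is 0 when a < b."""
    return comb(a, b) if a >= b else 0


def hilbert_twisted_free(shifts, n, t):
    """Hilbert function at t of R(-d_1) + ... + R(-d_m), R = k[x_0, ..., x_n],
    where shifts = [d_1, ..., d_m]."""
    return sum(binom(t - d + n, n) for d in shifts)


def hilbert_from_resolution(resolution, n, t):
    """Alternating sum of the Hilbert functions of F_0, F_1, ..., each F_j
    being given by its list of shifts."""
    return sum((-1) ** j * hilbert_twisted_free(shifts, n, t)
               for j, shifts in enumerate(resolution))


# 0 -> R(-3)^2 -> R(-2)^3 -> R -> R/I -> 0 for the twisted cubic in P^3.
twisted_cubic = [[0], [2, 2, 2], [3, 3]]
values = [hilbert_from_resolution(twisted_cubic, 3, t) for t in range(6)]
```

The computed values agree with $3t + 1$ for every $t \ge 0$ tested, including the small cases $t \le 2$ where some binomial coefficients vanish.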

In this example, the Hilbert function is a polynomial once t is sufficiently 
large ($t \ge 0$ in this case). This is a special case of the following general
result. 

(4.7) Proposition. If $M$ is a finitely generated $R$-module, then there is
a unique polynomial $HP_M$ such that

$$H_M(t) = HP_M(t)$$

for all $t$ sufficiently large.

Proof. The key point is that for a twisted free module of the form

$$F = R(-d_1) \oplus \cdots \oplus R(-d_m),$$

Exercise 2 and (4.2) imply that

$$H_F(t) = \sum_{i=1}^{m} \binom{t - d_i + n}{n}.$$

Furthermore, (4.3) shows that this is a polynomial in $t$ provided $t \ge
\max(d_1, \ldots, d_m)$.

Now suppose that $M$ is a finitely generated $R$-module. We can find a
finite graded resolution

$$0 \to F_\ell \to \cdots \to F_0 \to M \to 0,$$

and Theorem (4.4) tells us that

$$H_M(t) = \sum_{j=0}^{\ell} (-1)^j H_{F_j}(t).$$

The above computation implies that $H_{F_j}(t)$ is a polynomial in $t$ for $t$
sufficiently large, so that the same is true for $H_M(t)$. □

The polynomial $HP_M$ given in Proposition (4.7) is called the Hilbert
polynomial of $M$. For example, if $I$ is the ideal given by (4.5), then (4.6)
implies that

(4.8) $$HP_{R/I}(t) = 3t + 1$$

in this case.

The Hilbert polynomial contains some interesting geometric information.
For example, a homogeneous ideal $I \subset k[x_0, \ldots, x_n]$ determines the
projective variety $V = \mathbf{V}(I) \subset \mathbb{P}^n$, and the Hilbert polynomial tells us the
following facts about $V$:

• The degree of the Hilbert polynomial $HP_{R/I}$ is the dimension of the
variety $V$. For example, in Chapter 9 of [CLO], this is the definition of
the dimension of a projective variety.

• If the Hilbert polynomial $HP_{R/I}$ has degree $d = \dim V$, then one can
show that its leading term is $(D/d!)\,t^d$ for some positive integer $D$.
The integer $D$ is defined to be the degree of the variety $V$. One can
also prove that $D$ equals the number of points where $V$ meets a generic
$(n - d)$-dimensional linear subspace of $\mathbb{P}^n$.

For example, the Hilbert polynomial $HP_{R/I}(t) = 3t + 1$ from (4.8) shows
that the twisted cubic has dimension 1 and degree 3. In the exercises at
the end of the section, you will compute additional examples of Hilbert 
functions and Hilbert polynomials. 
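
The dimension and degree can be read off from a table of values of the Hilbert polynomial by taking finite differences: for a polynomial with leading term $(D/d!)\,t^d$, the $d$-th difference is the constant $D$. The Python sketch below is one way to do this, under the assumptions that the supplied values are taken at consecutive integers where the Hilbert function already agrees with the Hilbert polynomial, and that more values are supplied than the degree. The function name is hypothetical.

```python
def dimension_and_degree(values):
    """Given values P(t0), P(t0 + 1), ... of a nonzero polynomial P, return
    (d, D), where d is the degree of P and D is d! times its leading
    coefficient, found by repeated finite differences."""
    diffs = list(values)
    d = 0
    # Difference until the list is constant; the number of steps is the degree.
    while len(diffs) > 1 and any(x != diffs[0] for x in diffs):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
        d += 1
    return d, diffs[0]


# Twisted cubic: HP(t) = 3t + 1.
twisted_cubic_invariants = dimension_and_degree([3 * t + 1 for t in range(5)])

# Three points in the plane: HP(t) = 3.
three_points_invariants = dimension_and_degree([3, 3, 3, 3])
```

For the twisted cubic this returns dimension 1 and degree 3, matching the discussion above; a constant Hilbert polynomial 3 gives dimension 0 and degree 3.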



The Ideal of Three Points 

Given a homogeneous ideal $I \subset k[x_0, \ldots, x_n]$, we get the projective variety
$V = \mathbf{V}(I)$. We've seen that a graded resolution enables us to compute the
Hilbert polynomial, which in turn determines geometric invariants of $V$
such as the dimension and degree. However, the actual terms appearing in
a graded resolution of the ideal $I$ encode additional geometric information
about the variety $V$. We will illustrate this by considering the form of the
resolution of the ideal of a collection of points in $\mathbb{P}^2$. For example, consider
varieties consisting of three distinct points, namely $V = \{p_1, p_2, p_3\} \subset \mathbb{P}^2$.
There are two cases here, depending on whether the $p_i$ are collinear or not.
We begin with a specific example.

Exercise 3. Suppose that $V = \{p_1, p_2, p_3\} = \{(0, 0, 1), (1, 0, 1), (0, 1, 1)\}$.

a. Show that $I = \mathbf{I}(V)$ is the ideal $\langle x^2 - xz,\ xy,\ y^2 - yz \rangle \subset R = k[x, y, z]$.

b. Show that we have a graded resolution

$$0 \to R(-3)^2 \to R(-2)^3 \to I \to 0$$

and explain how this relates to (1.10).

c. Compute that the Hilbert function of $R/I$ is

$$H_{R/I}(t) = \begin{cases} 1 & \text{if } t = 0, \\ 3 & \text{if } t \ge 1. \end{cases}$$

The Hilbert polynomial in Exercise 3 is the constant polynomial 3, so
the dimension is 0 and the degree is 3, as expected. There is also some nice
intuition lying behind the graded resolution

(4.9) $$0 \to R(-3)^2 \to R(-2)^3 \to I \to 0$$

found in part b of the exercise. First, note that $I_0 = \{0\}$ since 0 is the
only constant vanishing on the points, and $I_1 = \{0\}$ since the points of
$V = \{(0, 0, 1), (1, 0, 1), (0, 1, 1)\}$ are noncollinear. On the other hand, there
are quadratics which vanish on $V$. One way to see this is to let $\ell_{ij}$ be the
equation of the line vanishing on points $p_i$ and $p_j$. Then $f_1 = \ell_{12}\ell_{13}$, $f_2 =
\ell_{12}\ell_{23}$, $f_3 = \ell_{13}\ell_{23}$ are three quadratics vanishing precisely on $V$. Hence
it makes sense that $I$ is generated by three quadratics, which is what the
$R(-2)^3$ in (4.9) says. Also, notice that $f_1, f_2, f_3$ have obvious syzygies of
degree 1, for example, $\ell_{23} f_1 - \ell_{13} f_2 = 0$. It is less obvious that two of
these syzygies are free generators of the syzygy module, but this is what
the $R(-3)^2$ in (4.9) means.

From a more sophisticated point of view, the resolution (4.9) is fairly
obvious. This is because of the converse of the Hilbert-Burch Theorem
discussed at the end of §2, which applies here since $V \subset \mathbb{P}^2$ is a finite set
of points and hence is Cohen-Macaulay of dimension $2 - 2 = 0$.

The example presented in Exercise 3 is more general than one might
suspect. This is because for three noncollinear points $p_1, p_2, p_3$, there is a linear
change of coordinates on $\mathbb{P}^2$ taking $p_1, p_2, p_3$ to $(0, 0, 1), (1, 0, 1), (0, 1, 1)$.
Using this, we see that if $I$ is the ideal of any set of three noncollinear
points, then $I$ has a free resolution of the form (4.9), so that the Hilbert
function of $R/I$ is given by part c of Exercise 3.
The next two exercises will study what happens when the three points
are collinear.

Exercise 4. Suppose that $V = \{(0, 1, 0), (0, 0, 1), (0, \lambda, 1)\}$, where $\lambda \ne 0$.
These points lie on the line $x = 0$, so that $V$ is a collinear triple of points.

a. Show that $I = \mathbf{I}(V)$ has a graded resolution of the form

$$0 \to R(-4) \to R(-3) \oplus R(-1) \to I \to 0.$$

Hint: Show that $I = \langle x,\ yz(y - \lambda z) \rangle$.

b. Show that the Hilbert function of $R/I$ is

$$H_{R/I}(t) = \begin{cases} 1 & \text{if } t = 0, \\ 2 & \text{if } t = 1, \\ 3 & \text{if } t \ge 2. \end{cases}$$

Exercise 5. Suppose now that $V = \{p_1, p_2, p_3\}$ is any triple of collinear
points in $\mathbb{P}^2$. Show that $I = \mathbf{I}(V)$ has a graded resolution of the form

(4.10) $$0 \to R(-4) \to R(-3) \oplus R(-1) \to I \to 0$$

and conclude that the Hilbert function of $R/I$ is as in part b of Exercise 4.
Hint: Use a linear change of coordinates in $\mathbb{P}^2$.

The intuition behind (4.10) is that in the collinear case, $V$ is the
intersection of a line and a cubic, and the only syzygy between these is the obvious
one. In geometric terms, we say that $V$ is a complete intersection in this
case since its dimension (= 0) is the dimension of the ambient space (=
2) minus the number of defining equations (= 2). Note that a noncollinear
triple isn't a complete intersection since there are three defining equations.
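
The Hilbert functions stated in Exercises 3 and 4 can be checked independently of any resolution. For a finite set of points $V$, the degree $t$ part of $\mathbf{I}(V)$ is the kernel of the linear map that evaluates forms of degree $t$ at fixed coordinate vectors of the points, so $\dim_k (R/I)_t$ is the rank of the matrix of monomial values. The Python sketch below uses this observation; it is not the method of the text, the function names are hypothetical, and the value $\lambda = 2$ over the rational numbers is an assumed sample choice.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import prod


def monomial_exponents(num_vars, degree):
    """Exponent vectors of all monomials of the given degree."""
    exponents = []
    for combo in combinations_with_replacement(range(num_vars), degree):
        e = [0] * num_vars
        for v in combo:
            e[v] += 1
        exponents.append(tuple(e))
    return exponents


def rank(matrix):
    """Rank of a matrix of rational numbers, by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def hilbert_function_of_points(points, t):
    """dim_k (R/I(V))_t for a finite set V of projective points, computed as
    the rank of the evaluation matrix of the degree t monomials."""
    exps = monomial_exponents(len(points[0]), t)
    matrix = [[prod(c ** e for c, e in zip(p, exp)) for exp in exps]
              for p in points]
    return rank(matrix)


noncollinear = [(0, 0, 1), (1, 0, 1), (0, 1, 1)]
collinear = [(0, 1, 0), (0, 0, 1), (0, 2, 1)]  # lambda = 2, an assumed value
noncollinear_values = [hilbert_function_of_points(noncollinear, t) for t in range(5)]
collinear_values = [hilbert_function_of_points(collinear, t) for t in range(5)]
```

The two lists differ only at $t = 1$, which reflects the fact that a collinear triple lies on a line (so $I_1 \ne \{0\}$) while a noncollinear triple does not.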

This sequence of exercises shows that for triples of points in $\mathbb{P}^2$, their
corresponding ideals $I$ all give the same Hilbert polynomial $HP_{R/I} = 3$.
But depending on whether the points are collinear or not, we get different
resolutions (4.10) and (4.9) and different Hilbert functions, as in part c of
Exercise 3 and part b of Exercise 4. This is quite typical of what happens.
Here is a similar but more challenging example. 

Exercise 6. Now consider varieties $V = \{p_1, p_2, p_3, p_4\}$ in $\mathbb{P}^2$, and write
$I = \mathbf{I}(V) \subset R = k[x, y, z]$ as above.

a. First assume the points of $V$ are in general position in the sense that no
three are collinear. Show that $I_2$ is 2-dimensional over $k$, and that $I$ is
generated by any two linearly independent elements of $I_2$. Deduce that
a graded resolution of $I$ has the form

$$0 \to R(-4) \to R(-2)^2 \to I \to 0,$$

and use this to compute $H_{R/I}(t)$ for all $t$. Do you see how the $R(-2)^2$
is consistent with Bézout's Theorem?

b. Now assume that three of the points of $V$ lie on a line $L \subset \mathbb{P}^2$ but the
fourth does not. Show that every element of $I_2$ is reducible, containing
as a factor a linear polynomial vanishing on $L$. Show that $I_2$ does not
generate $I$ in this case, and deduce that a graded resolution of $I$ has the
form

$$0 \to R(-3) \oplus R(-4) \to R(-2)^2 \oplus R(-3) \to I \to 0.$$

Use this to compute $H_{R/I}(t)$ for all $t$.

c. Finally, consider the case where all four of the points are collinear. Show
that in this case, the graded resolution has the form

$$0 \to R(-5) \to R(-1) \oplus R(-4) \to I \to 0,$$

and compute the Hilbert function of $R/I$ for all $t$.

d. In which cases is $V$ a complete intersection?

Understanding the geometric significance of the shape of the graded
resolution of $I = \mathbf{I}(V)$ in more involved examples is an area of active research
in contemporary algebraic geometry. One of the most tantalizing unsolved
problems as of this writing is a conjecture made by Mark Green concerning 
the graded resolutions of the ideals of canonical curves. See [Schre2], [EH] 
for some recent work on those and other topics concerning resolutions. Also, 
Section 15.12 of [Eis] has some interesting projects dealing with resolutions, 
and some of the exercises in Section 15.11 are also relevant. 



Parametric Plane Curves 



Here, we will begin with a curve in $k^2$ parametrized by rational functions

(4.11) $$x = \frac{a(t)}{c(t)}, \qquad y = \frac{b(t)}{c(t)},$$

where $a, b, c \in k[t]$ are polynomials such that $c \ne 0$ and $\mathrm{GCD}(a, b, c) = 1$.
We also set $n = \max(\deg a, \deg b, \deg c)$. Parametrizations of this form
play an important role in computer-aided geometric design, and a question
of particular interest is the implicitization problem, which asks how
the equation $f(x, y) = 0$ of the underlying curve is obtained from the
parametrization (4.11). An introduction to implicitization can be found in
Chapter 3 of [CLO].

A basic object in this theory is the ideal

(4.12) $$I = \langle c(t)x - a(t),\ c(t)y - b(t) \rangle \subset k[x, y, t].$$

This ideal has the following interpretation. Let $W \subset k$ be the roots of $c(t)$,
i.e., the solutions of $c(t) = 0$. Then we can regard (4.11) as the function
$F : k - W \to k^2$ defined by

$$F(t) = \left( \frac{a(t)}{c(t)}, \frac{b(t)}{c(t)} \right).$$

In Exercise 14 at the end of the section, you will show that the graph
of $F$, regarded as a subset of $k^3$, is precisely the variety $\mathbf{V}(I)$. From here,
one can prove that the intersection $I_1 = I \cap k[x, y]$ is an ideal in $k[x, y]$
such that $\mathbf{V}(I_1) \subset k^2$ is the smallest variety containing the image of the
parametrization (4.11) (see Exercise 14). In the terminology of Chapter 2,
$I_1 = I \cap k[x, y]$ is an elimination ideal, which we can compute using a
Gröbner basis with respect to a suitable monomial order.

It follows that the ideal $I$ contains a lot of information about the curve 
parametrized by (4.11). Recently, it was discovered (see [SSQK] and [SC]) 
that $I$ provides other parametrizations of the curve, different from (4.11). 
To see how this works, let $I(1)$ denote the subset of $I$ consisting of all 
elements of $I$ of total degree at most 1 in $x$ and $y$. Thus 

$$I(1) = \{ f \in I : f = A(t)x + B(t)y + C(t) \}. \tag{4.13}$$

An element $A(t)x + B(t)y + C(t) \in I(1)$ is called a moving line since 
for $t$ fixed, the equation $A(t)x + B(t)y + C(t) = 0$ describes a line in the 
plane, and as $t$ moves, so does the line. 

Exercise 7. Given a moving line $A(t)x + B(t)y + C(t) \in I(1)$, suppose 
that $t \in k$ satisfies $c(t) \ne 0$. Then show that the point given by
(4.11) lies on the line $A(t)x + B(t)y + C(t) = 0$. Hint: Use
$I(1) \subset I$. 

Now suppose that we have moving lines $f, g \in I(1)$. Then, for a fixed $t$, 
we get a pair of lines, which typically intersect in a point. By Exercise 7, 
each of these lines contains $(a(t)/c(t), b(t)/c(t))$, so this must be the
point of intersection. Hence, as we vary $t$, the intersection of the moving
lines will trace out our curve. 

Notice that our original parametrization (4.11) is given by moving lines, 
since we have the vertical line $x = a(t)/c(t)$ and the horizontal line
$y = b(t)/c(t)$. However, by allowing more general moving lines, one can get 
polynomials of smaller degree in $t$. The following exercise gives an example 
of how this can happen. 

Exercise 8. Consider the parametrization 

$$x = \frac{2t^2 + 4t + 5}{t^2 + 2t + 3}, \qquad y = \frac{3t^2 + t + 4}{t^2 + 2t + 3}.$$

a. Prove that $p = (5t + 5)x - y - (10t + 7)$ and
$q = (5t - 5)x - (t + 2)y + (-7t + 11)$ are moving lines, i.e.,
$p, q \in I$, where $I$ is as in (4.12). 

b. Prove that $p$ and $q$ generate $I$, i.e., $I = \langle p, q \rangle$. 

In Exercise 8, the original parametrization had maximum degree 2 in $t$, 
while the moving lines $p$ and $q$ have maximum degree 1. This is typical 
of what happens, for we will show below that in general, if $n$ is the
maximum degree of $a, b, c$, then there are moving lines $p, q \in I$ such
that $p$ has maximum degree $\mu \le \lfloor n/2 \rfloor$ in $t$ and $q$ has
maximum degree $n - \mu$. Furthermore, $p$ and $q$ are actually a basis of
the ideal $I$. In the terminology of [CSC], this is the moving line basis or
$\mu$-basis of the ideal. 

Our goal here is to prove this result (the existence of a $\mu$-basis) and to 
explain what this has to do with graded resolutions and Hilbert functions. 
We begin by studying the subset $I(1) \subset I$ defined in (4.13). It is
closed under addition, and more importantly, $I(1)$ is closed under
multiplication by elements of $k[t]$ (be sure you understand why). Hence
$I(1)$ has a natural structure as a $k[t]$-module. In fact, $I(1)$ is a
syzygy module, which we will now show. 

(4.14) Lemma. Let $a, b, c \in k[t]$ satisfy $c \ne 0$ and
$\mathrm{GCD}(a, b, c) = 1$, and set $I = \langle cx - a, cy - b \rangle$.
Then, for $A, B, C \in k[t]$, 

$$A(t)x + B(t)y + C(t) \in I \iff A(t)a(t) + B(t)b(t) + C(t)c(t) = 0.$$

Thus the map $A(t)x + B(t)y + C(t) \mapsto (A, B, C)$ defines an isomorphism 
of $k[t]$-modules $I(1) \cong \mathrm{Syz}(a, b, c)$. 

Proof. To prove $\Rightarrow$, consider the ring homomorphism
$k[x, y, t] \to k(t)$ which sends $x, y, t$ to
$\frac{a(t)}{c(t)}, \frac{b(t)}{c(t)}, t$. Since the generators of $I$ map
to zero, so does $A(t)x + B(t)y + C(t) \in I$. Thus
$A(t)\frac{a(t)}{c(t)} + B(t)\frac{b(t)}{c(t)} + C(t) = 0$ in $k(t)$, and
multiplying by $c(t)$ gives the desired equation. 

For the other implication, let $S = k[t]$ and consider the sequence 

$$S^3 \xrightarrow{\ \alpha\ } S^3 \xrightarrow{\ \beta\ } S \tag{4.15}$$

where $\alpha(h_1, h_2, h_3) = (ch_1 + bh_3,\ ch_2 - ah_3,\ -ah_1 - bh_2)$
and $\beta(A, B, C) = Aa + Bb + Cc$. One easily checks that
$\beta \circ \alpha = 0$, so that $\mathrm{im}(\alpha) \subset \ker(\beta)$.
It is less obvious that (4.15) is exact at the middle term, i.e.,
$\mathrm{im}(\alpha) = \ker(\beta)$. This will be proved in Exercise 15
below. The sequence (4.15) is the Koszul complex determined by $a, b, c$
(see Exercise 10 of §2 for another example of a Koszul complex). A Koszul
complex is not always exact, but Exercise 15 will show that (4.15) is exact
in our case because $\mathrm{GCD}(a, b, c) = 1$. 

Now suppose that $Aa + Bb + Cc = 0$. We need to show that
$Ax + By + C \in I$. This is now easy, since our assumption on $A, B, C$
implies $(A, B, C) \in \ker(\beta)$. By the exactness of (4.15),
$(A, B, C) \in \mathrm{im}(\alpha)$, which means we can find
$h_1, h_2, h_3 \in S$ such that 

$$A = ch_1 + bh_3, \qquad B = ch_2 - ah_3, \qquad C = -ah_1 - bh_2.$$



Hence 

$$\begin{aligned}
Ax + By + C &= (ch_1 + bh_3)x + (ch_2 - ah_3)y - ah_1 - bh_2 \\
&= (h_1 + yh_3)(cx - a) + (h_2 - xh_3)(cy - b) \in I,
\end{aligned}$$

as desired. The final assertion of the lemma now follows immediately. □ 

(4.16) Definition. Given a parametrization (4.11), we get the ideal
$I = \langle cx - a, cy - b \rangle$ and the syzygy module $I(1)$ from
(4.13). Then we define $\mu$ to be the minimal degree in $t$ of a nonzero
element in $I(1)$. 
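
Lemma (4.14) and Definition (4.16) can be tried out on the parametrization of Exercise 8. The following is a minimal sketch, not part of the original text: polynomials in $t$ are stored as integer coefficient lists (lowest degree first), and the helper names `poly_mul`, `poly_add`, `is_syzygy` and `det3` are illustrative choices. It checks the syzygy criterion of the lemma for $p$ and $q$, and it uses a 3 x 3 determinant of coefficient vectors to test whether a nonzero syzygy of degree 0 in $t$ could exist.

```python
# Polynomials in t are coefficient lists, lowest degree first:
# [3, 2, 1] stands for 3 + 2t + t^2.

def poly_mul(f, g):
    """Product of two coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def poly_add(*polys):
    """Sum of coefficient lists of possibly different lengths."""
    out = [0] * max(len(f) for f in polys)
    for f in polys:
        for i, fi in enumerate(f):
            out[i] += fi
    return out

def is_syzygy(triple, a, b, c):
    """True when A*a + B*b + C*c is the zero polynomial (Lemma 4.14)."""
    A, B, C = triple
    total = poly_add(poly_mul(A, a), poly_mul(B, b), poly_mul(C, c))
    return all(coeff == 0 for coeff in total)

def det3(rows):
    """Determinant of a 3x3 matrix given as three rows."""
    (r0, r1, r2) = rows
    return (r0[0] * (r1[1] * r2[2] - r1[2] * r2[1])
            - r0[1] * (r1[0] * r2[2] - r1[2] * r2[0])
            + r0[2] * (r1[0] * r2[1] - r1[1] * r2[0]))

# Parametrization of Exercise 8.
a = [5, 4, 2]   # 2t^2 + 4t + 5
b = [4, 1, 3]   # 3t^2 + t + 4
c = [3, 2, 1]   # t^2 + 2t + 3

# Moving lines of Exercise 8 as triples (A, B, C) for A x + B y + C.
p = ([5, 5], [-1], [-7, -10])       # (5t + 5) x - y - (10t + 7)
q = ([-5, 5], [-2, -1], [11, -7])   # (5t - 5) x - (t + 2) y + (-7t + 11)

p_ok = is_syzygy(p, a, b, c)
q_ok = is_syzygy(q, a, b, c)

# A nonzero syzygy with constant A, B, C exists exactly when the
# coefficient vectors of a, b, c are linearly dependent over k.
constant_syzygy_exists = det3([a, b, c]) == 0

print(p_ok, q_ok, constant_syzygy_exists)
```

If both checks succeed and no constant syzygy exists, then the minimal degree of Definition (4.16) is 1 for this example, in agreement with part c of Exercise 9.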

The following theorem shows the existence of a $\mu$-basis of the ideal $I$. 

(4.17) Theorem. Given (4.11), where $c \ne 0$ and
$\mathrm{GCD}(a, b, c) = 1$, set $n = \max(\deg a, \deg b, \deg c)$ and
$I = \langle cx - a, cy - b \rangle$ as usual. If $\mu$ is as in
Definition (4.16), then 

$$\mu \le \lfloor n/2 \rfloor,$$

and we can find $p, q \in I$ such that $p$ has degree $\mu$ in $t$, $q$ has
degree $n - \mu$ in $t$, and $I = \langle p, q \rangle$. 

Proof. We will study the syzygy module $\mathrm{Syz}(a, b, c)$ using the
methods of §3. For this purpose, we need to homogenize $a, b, c$. Let $t, u$
be homogeneous variables and consider the ring $R = k[t, u]$. Then
$\tilde{a}(t, u)$ will denote the degree $n$ homogenization of $a(t)$, i.e., 

$$\tilde{a}(t, u) = u^n\, a\!\left(\tfrac{t}{u}\right) \in R.$$

In this way, we get degree $n$ homogeneous polynomials
$\tilde{a}, \tilde{b}, \tilde{c} \in R$, and the reader should check that
$\mathrm{GCD}(a, b, c) = 1$ and $n = \max(\deg a, \deg b, \deg c)$ imply
that $\tilde{a}, \tilde{b}, \tilde{c}$ have no common zeros in
$\mathbb{P}^1$. In other words, the only solution of
$\tilde{a} = \tilde{b} = \tilde{c} = 0$ is $t = u = 0$. 

Now let $J = \langle \tilde{a}, \tilde{b}, \tilde{c} \rangle \subset R = k[t, u]$.
We first compute the Hilbert polynomial $HP_J$ of $J$. The key point is that
since $\tilde{a} = \tilde{b} = \tilde{c} = 0$ have only one solution, no
matter what the field is, the Finiteness Theorem from §2 of Chapter 2
implies that the quotient ring $R/J = k[t, u]/J$ is a finite dimensional
vector space over $k$. But $J$ is a homogeneous ideal, which means that
$R/J$ is a graded ring. In order for $R/J$ to have finite dimension, we
must have $\dim_k (R/J)_s = 0$ for all $s$ sufficiently large (we use $s$
instead of $t$ since $t$ is now one of our variables). It follows that
$HP_{R/J}$ is the zero polynomial. Then the exact sequence 

$$0 \to J \to R \to R/J \to 0$$

and Exercise 2 imply that 

$$HP_J(s) = HP_R(s) = \binom{s + 1}{1} = s + 1 \tag{4.18}$$

since $R = k[t, u]$. For future reference, note also that by (4.2), 

$$HP_{R(-d)}(s) = \binom{s - d + 1}{1} = s - d + 1.$$

Now consider the exact sequence 

$$0 \to \mathrm{Syz}(\tilde{a}, \tilde{b}, \tilde{c}) \to R(-n)^3 \xrightarrow{\ \alpha\ } J \to 0,$$

where $\alpha(A, B, C) = A\tilde{a} + B\tilde{b} + C\tilde{c}$. By
Proposition (3.20), the syzygy module
$\mathrm{Syz}(\tilde{a}, \tilde{b}, \tilde{c})$ is free, which means that we
get a graded resolution 

$$0 \to R(-d_1) \oplus \cdots \oplus R(-d_m) \xrightarrow{\ \beta\ } R(-n)^3 \xrightarrow{\ \alpha\ } J \to 0 \tag{4.19}$$

for some $d_1, \ldots, d_m$. By Exercise 2, the Hilbert polynomial of the
middle term is the sum of the other two Hilbert polynomials. Since we know
$HP_J$ from (4.18), we obtain 

$$\begin{aligned}
3(s - n + 1) &= (s - d_1 + 1) + \cdots + (s - d_m + 1) + (s + 1) \\
&= (m + 1)s + m + 1 - d_1 - \cdots - d_m.
\end{aligned}$$

It follows that $m = 2$ and $3n = d_1 + d_2$. Thus (4.19) becomes 

$$0 \to R(-d_1) \oplus R(-d_2) \xrightarrow{\ \beta\ } R(-n)^3 \xrightarrow{\ \alpha\ } J \to 0. \tag{4.20}$$

The matrix $L$ representing $\beta$ is a $3 \times 2$ matrix 

$$L = \begin{pmatrix} p_1 & q_1 \\ p_2 & q_2 \\ p_3 & q_3 \end{pmatrix}, \tag{4.21}$$

and since $\beta$ has degree zero, the first column of $L$ consists of
homogeneous polynomials of degree $\mu_1 = d_1 - n$ and the second column
has degree $\mu_2 = d_2 - n$. Then $\mu_1 + \mu_2 = n$ follows from
$3n = d_1 + d_2$. 
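
As a quick sanity check of this degree count (a worked instance added for illustration, not part of the original argument), take $n = 2$ and assume $\mu_1 = \mu_2 = 1$, which is the situation Exercise 9 describes for the parametrization of Exercise 8. Then $d_1 = d_2 = n + 1 = 3$, and the Hilbert polynomial identity reads:

```latex
\begin{aligned}
3(s - 2 + 1) &= (s - 3 + 1) + (s - 3 + 1) + (s + 1) \\
3s - 3 &= 3s - 3 ,
\end{aligned}
\qquad\text{with } d_1 + d_2 = 6 = 3n \text{ and } \mu_1 + \mu_2 = 2 = n .
```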

We may assume that $\mu_1 \le \mu_2$. Since the first column
$(p_1, p_2, p_3)$ of (4.21) satisfies
$p_1\tilde{a} + p_2\tilde{b} + p_3\tilde{c} = 0$, setting $u = 1$ gives 

$$p_1(t, 1)a(t) + p_2(t, 1)b(t) + p_3(t, 1)c(t) = 0.$$

Thus $p = p_1(t, 1)x + p_2(t, 1)y + p_3(t, 1) \in I(1)$ by Lemma (4.14).
Similarly, the second column of (4.21) gives
$q = q_1(t, 1)x + q_2(t, 1)y + q_3(t, 1) \in I(1)$. 
We will show that $p$ and $q$ satisfy the conditions of the theorem. 

First observe that the columns of $L$ generate
$\mathrm{Syz}(\tilde{a}, \tilde{b}, \tilde{c})$ by exactness. In Exercise 16,
you will show this implies that $p$ and $q$ generate $I(1)$. Since $cx - a$
and $cy - b$ are in $I(1)$, we obtain
$I = \langle cx - a, cy - b \rangle \subset \langle p, q \rangle$. The other
inclusion is immediate from $p, q \in I(1) \subset I$, and
$I = \langle p, q \rangle$ follows. 

The next step is to prove $\mu_1 = \mu$. We begin by showing that $p$ has
degree $\mu_1$ in $t$. This follows because
$p_1(t, u), p_2(t, u), p_3(t, u)$ are homogeneous of degree $\mu_1$. If the
degree of all three were to drop when we set $u = 1$, then each $p_i$ would
be divisible by $u$. However, since $p_1, p_2, p_3$ give a syzygy on
$\tilde{a}, \tilde{b}, \tilde{c}$, so would $p_1/u, p_2/u, p_3/u$. Hence we
would have a syzygy of degree $< \mu_1$. But the columns of $L$ generate the
syzygy module, so this is impossible since $\mu_1 \le \mu_2$. Hence $p$ has
degree $\mu_1$ in $t$, and then $\mu \le \mu_1$ follows from the definition
of $\mu$. However, if $\mu < \mu_1$, then we would have
$Ax + By + C \in I(1)$ of degree $< \mu_1$. This gives a syzygy of
$a, b, c$, and homogenizing, we would get a syzygy of degree $< \mu_1$ among
$\tilde{a}, \tilde{b}, \tilde{c}$. As we saw earlier in the paragraph, this
is impossible. 

We conclude that $p$ has degree $\mu$ in $t$, and then $\mu_1 + \mu_2 = n$
implies that $q$ has degree $\mu_2 = n - \mu$ in $t$. Finally,
$\mu \le \lfloor n/2 \rfloor$ follows from $\mu = \mu_1 \le \mu_2$, 
and the proof of the theorem is complete. □ 

As already mentioned, the basis $p, q$ constructed in Theorem (4.17) is 
called a $\mu$-basis of $I$. One property of the $\mu$-basis is that it can
be used to find the implicit equation of the parametrization (4.11). Here is
an example of how this works. 

Exercise 9. The parametrization studied in Exercise 8 gives the ideal 

$$I = \langle (t^2 + 2t + 3)x - (2t^2 + 4t + 5),\ (t^2 + 2t + 3)y - (3t^2 + t + 4) \rangle.$$

a. Use Gröbner basis methods to find the intersection $I \cap k[x, y]$. This
gives the implicit equation of the curve. 

b. Show that the resultant of the generators of $I$ with respect to $t$ gives 
the implicit equation. 

c. Verify that the polynomials $p = (5t + 5)x - y - (10t + 7)$ and
$q = (5t - 5)x - (t + 2)y + (-7t + 11)$ are a $\mu$-basis for $I$. Thus
$\mu = 1$, which is the biggest possible value of $\mu$ (since $n = 2$). 

d. Show that the resultant of $p$ and $q$ also gives the implicit equation. 

Parts b and d of Exercise 9 express the implicit equation as a resultant. 
However, if we use the Sylvester determinant, then part b uses a
$4 \times 4$ determinant, while part d uses a $2 \times 2$ determinant. So
the $\mu$-basis gives a smaller expression for the resultant. In general,
one can show (see [CSC]) that for the $\mu$-basis, the resultant can be
expressed as an $(n - \mu) \times (n - \mu)$ determinant. Unfortunately, it
can also happen that this method gives a power of the actual implicit
equation (see Section 4 of [CSC]). 
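
The $2 \times 2$ determinant of part d can be explored numerically. The sketch below is an illustration under stated assumptions rather than a solution from the text: it writes $p$ and $q$ from Exercise 8 as linear polynomials in $t$ with coefficients in $x, y$, forms their Sylvester determinant, and evaluates it with exact rational arithmetic at points of the curve. The function names `curve_point` and `mu_basis_resultant` are hypothetical.

```python
from fractions import Fraction

def curve_point(t):
    """Point (x, y) of the parametrization in Exercise 8 at parameter t."""
    t = Fraction(t)
    c = t * t + 2 * t + 3   # equals (t + 1)^2 + 2, so never zero for rational t
    return (2 * t * t + 4 * t + 5) / c, (3 * t * t + t + 4) / c

def mu_basis_resultant(x, y):
    """Sylvester determinant in t of p and q, evaluated at the point (x, y).

    Collecting powers of t:
        p = (5x - 10) t + (5x - y - 7)
        q = (5x - y - 7) t + (-5x - 2y + 11)
    """
    p1, p0 = 5 * x - 10, 5 * x - y - 7
    q1, q0 = 5 * x - y - 7, -5 * x - 2 * y + 11
    return p1 * q0 - p0 * q1

# Evaluate the determinant along the curve for several parameter values.
values = [mu_basis_resultant(*curve_point(t)) for t in range(-5, 6)]
print(all(v == 0 for v in values))
```

The determinant vanishes at a curve point because $p$ and $q$, viewed as polynomials in $t$, then share the root given by the parameter of that point; at a point off the curve, such as the origin, it need not vanish.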

Earlier in this section, we considered the ideal of three points in
$\mathbb{P}^2$. We found that although all such ideals have the same Hilbert
polynomial, we can distinguish the collinear and noncollinear cases using
the Hilbert function. The situation is similar when dealing with
$\mu$-bases. Here, we have the ideal
$J = \langle \tilde{a}, \tilde{b}, \tilde{c} \rangle \subset R = k[t, u]$
from the proof of Theorem (4.17). In the following exercise you will compute
the Hilbert function of $R/J$. 

Exercise 10. Let $J = \langle \tilde{a}, \tilde{b}, \tilde{c} \rangle$ be as
in the proof of Theorem (4.17). In the course of the proof, we showed that
the Hilbert polynomial of $R/J$ is the zero polynomial. But what about the
Hilbert function? 
a. Prove that the Hilbert function is given by 

$$H_{R/J}(s) = \begin{cases}
s + 1 & \text{if } 0 \le s \le n - 1 \\
3n - 2s - 2 & \text{if } n \le s \le n + \mu - 1 \\
2n - s - \mu - 1 & \text{if } n + \mu \le s \le 2n - \mu - 1 \\
0 & \text{if } 2n - \mu \le s.
\end{cases}$$

b. Show that the largest value of $s$ such that $H_{R/J}(s) \ne 0$ is
$s = 2n - \mu - 2$, and conclude that knowing $\mu$ is equivalent to knowing
the Hilbert function of the quotient ring $R/J$. 

c. Compute the dimension of $R/J$ as a vector space over $k$. 
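
The four-case formula in part a can be compared against the resolution numerically. The sketch below is an added check, not a proof, and rests on one assumption: splicing (4.20) with $0 \to J \to R \to R/J \to 0$ and using $d_1 = n + \mu$, $d_2 = 2n - \mu$ gives the resolution $0 \to R(-n-\mu) \oplus R(-2n+\mu) \to R(-n)^3 \to R \to R/J \to 0$, whose alternating sum of Hilbert functions computes $H_{R/J}$. The function names are illustrative.

```python
def h_R(j):
    """Hilbert function of R = k[t, u]: number of monomials of degree j."""
    return j + 1 if j >= 0 else 0

def hilbert_from_resolution(s, n, mu):
    """Alternating sum over the assumed resolution
    0 -> R(-n-mu) + R(-2n+mu) -> R(-n)^3 -> R -> R/J -> 0."""
    return h_R(s) - 3 * h_R(s - n) + h_R(s - n - mu) + h_R(s - 2 * n + mu)

def hilbert_piecewise(s, n, mu):
    """The four-case formula from part a of Exercise 10."""
    if 0 <= s <= n - 1:
        return s + 1
    if n <= s <= n + mu - 1:
        return 3 * n - 2 * s - 2
    if n + mu <= s <= 2 * n - mu - 1:
        return 2 * n - s - mu - 1
    return 0

# Compare the two descriptions for small n and every admissible mu.
agree = all(
    hilbert_from_resolution(s, n, mu) == hilbert_piecewise(s, n, mu)
    for n in range(1, 8)
    for mu in range(0, n // 2 + 1)
    for s in range(0, 3 * n)
)
print(agree)
```

Agreement on a finite range is only evidence for the formula; the exercise still asks for an argument valid for all $n$, $\mu$ and $s$.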

In the case of the ideal of three points, note that the noncollinear case 
is generic. This is true in the naive sense that one expects three randomly 
chosen points to be noncollinear, and this can be made more precise using 
the notion of generic given in Definition (5.6) of Chapter 3. Similarly, for 
$\mu$-bases, there is a generic case. One can show (see [CSC]) that among 
parametrizations (4.11) with $n = \max(\deg a, \deg b, \deg c)$, the
“generic” parametrization has $\mu = \lfloor n/2 \rfloor$, the biggest
possible value. More generally, one can compute the dimension of the set of
all parametrizations with a given $\mu$. This dimension decreases as $\mu$
decreases, so that the smaller the $\mu$, the more special the
parametrization. 

We should also mention that the Hilbert-Burch Theorem discussed in §2 
has the following nice application to $\mu$-bases. 

(4.22) Proposition. The $\mu$-basis coming from the columns of (4.21) can 
be chosen such that 

$$\tilde{a} = p_2q_3 - p_3q_2, \qquad \tilde{b} = -(p_1q_3 - p_3q_1), \qquad \tilde{c} = p_1q_2 - p_2q_1.$$

Dehomogenizing, this means that $a, b, c$ can be computed from the
coefficients of the $\mu$-basis 

$$\begin{aligned}
p &= p_1(t, 1)x + p_2(t, 1)y + p_3(t, 1) \\
q &= q_1(t, 1)x + q_2(t, 1)y + q_3(t, 1).
\end{aligned} \tag{4.23}$$

Proof. To see why this is true, first note that the exact sequence (4.20) 
has the form required by Proposition (2.6) of §2. Then the proposition 
implies that if $\tilde{f}_1, \tilde{f}_2, \tilde{f}_3$ are the
$2 \times 2$ minors of (4.21) (this is the notation of Proposition (2.6)),
then there is a polynomial $g \in k[t, u]$ such that
$\tilde{a} = g\tilde{f}_1$, $\tilde{b} = g\tilde{f}_2$,
$\tilde{c} = g\tilde{f}_3$. However, since
$\tilde{a}, \tilde{b}, \tilde{c}$ have no common roots, $g$ must be a
nonzero constant. If we replace $p_i$ with $gp_i$, we get a $\mu$-basis 
with the desired properties. □ 



Exercise 11. Verify that the $\mu$-basis studied in Exercises 8 and 9
satisfies Proposition (4.22) after changing $p$ by a suitable constant. 
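
The minors in Proposition (4.22) are easy to compute by machine for this example. The following sketch is an added illustration with hypothetical helper names (`poly_mul`, `poly_sub`, `signed_minors`); it works with the dehomogenized coefficients of $p$ and $q$ from Exercise 8, stored as integer coefficient lists in $t$ with the lowest degree first, and compares the signed $2 \times 2$ minors with $a, b, c$.

```python
def poly_mul(f, g):
    """Product of two coefficient lists (lowest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def poly_sub(f, g):
    """Difference f - g, with trailing zero coefficients removed."""
    size = max(len(f), len(g))
    out = [0] * size
    for i, fi in enumerate(f):
        out[i] += fi
    for i, gi in enumerate(g):
        out[i] -= gi
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def signed_minors(P, Q):
    """The three expressions of Proposition (4.22) built from two columns."""
    p1, p2, p3 = P
    q1, q2, q3 = Q
    return (
        poly_sub(poly_mul(p2, q3), poly_mul(p3, q2)),   # p2 q3 - p3 q2
        poly_sub(poly_mul(p3, q1), poly_mul(p1, q3)),   # -(p1 q3 - p3 q1)
        poly_sub(poly_mul(p1, q2), poly_mul(p2, q1)),   # p1 q2 - p2 q1
    )

a = [5, 4, 2]   # 2t^2 + 4t + 5
b = [4, 1, 3]   # 3t^2 + t + 4
c = [3, 2, 1]   # t^2 + 2t + 3

P = ([5, 5], [-1], [-7, -10])       # coefficients of x, y, 1 in p
Q = ([-5, 5], [-2, -1], [11, -7])   # coefficients of x, y, 1 in q

minors = signed_minors(P, Q)
for name, minor in zip("abc", minors):
    print(name, minor)
```

Comparing the printed lists with `a`, `b`, `c` shows which constant multiple of $p$ is needed in Exercise 11.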



It is also possible to generalize Theorem (4.17) by considering curves in 
$m$-dimensional space $k^m$ given parametrically by 

$$x_1 = \frac{a_1(t)}{c(t)}, \quad \ldots, \quad x_m = \frac{a_m(t)}{c(t)}, \tag{4.24}$$

where $c \ne 0$ and $\mathrm{GCD}(a_1, \ldots, a_m) = 1$. In this situation,
the syzygy module $\mathrm{Syz}(a_1, \ldots, a_m, c)$ and its homogenization
play an important role, and the analog of the $\mu$-basis (4.13) consists of
$m$ polynomials 

$$p_j = p_{1j}(t, 1)x_1 + \cdots + p_{mj}(t, 1)x_m + p_{m+1,j}(t, 1), \qquad 1 \le j \le m, \tag{4.25}$$

which form a basis for the ideal
$I = \langle cx_1 - a_1, \ldots, cx_m - a_m \rangle$. If we fix $t$ in
(4.25), then the equation $p_j = 0$ is a hyperplane in $k^m$, so that as $t$
varies, we get a moving hyperplane. One can prove that the common
intersection of the $m$ hyperplanes $p_j = 0$ sweeps out the given curve and
that if $p_j$ has degree $\mu_j$ in $t$, then
$\mu_1 + \cdots + \mu_m = n$. Thus we have an $m$-dimensional 
version of Theorem (4.17). See Exercise 17 for the proof. 

We can use the Hilbert-Burch Theorem to generalize Proposition (4.22) 
to the more general situation of (4.24). The result is that up to sign, the 
polynomials $a_1, \ldots, a_m, c$ are the $m \times m$ minors of the matrix
$(p_{ij}(t, 1))$ coming from (4.25). Note that since $p_j$ has degree
$\mu_j$ in $t$, the $m \times m$ minors of $(p_{ij}(t, 1))$ have degree at
most $\mu_1 + \cdots + \mu_m = n$ in $t$. So the degrees work out nicely.
The details will be covered in Exercise 17 below. 

The proof given of Theorem (4.17) makes nice use of the results of §3, 
especially Proposition (3.20), and the generalization (4.24) to curves in 
$k^m$ shows just how powerful these methods are. The heart of what we did 
in Theorem (4.17) was to understand the structure of the syzygy module 
$\mathrm{Syz}(\tilde{a}, \tilde{b}, \tilde{c})$ as a free module, and for
the $m$-dimensional case, one needs to understand
$\mathrm{Syz}(\tilde{a}_1, \ldots, \tilde{a}_m, \tilde{c})$ for
$\tilde{a}_1, \ldots, \tilde{a}_m, \tilde{c} \in k[t, u]$. Actually, in 
the special case of Theorem (4.17), one can give a proof using elementary 
methods which don’t require the Hilbert Syzygy Theorem. One such proof 
can be found in [CSC], and another was given by Franz Meyer, all the way 
back in 1887 [Mey]. 

Meyer’s article is interesting, for it starts with a problem completely 
different from plane curves, but just as happened to us, he ended up with 
a syzygy problem. He also considered the more general syzygy module 
$\mathrm{Syz}(a_1, \ldots, a_m, c)$, and he conjectured that this was a free
module with generators of degrees $\mu_1, \ldots, \mu_m$ satisfying
$\mu_1 + \cdots + \mu_m = n$. But in spite 
of many examples in support of this conjecture, his attempts at a proof 
“ran into difficulties which I have at this time not been able to overcome” 
[Mey, p. 73]. However, three years later, Hilbert proved everything in his 
groundbreaking paper [Hil] on syzygies. For us, it is interesting to note 
that after proving his Syzygy Theorem, Hilbert’s first application is to 
prove Meyer’s conjecture. He does this by computing a Hilbert polynomial 
(which he calls the characteristic function) in a manner remarkably similar 
to what we did in Theorem (4.17) — see [Hil, p. 516]. Hilbert then concludes 
with the Hilbert-Burch Theorem in the special case of k[t, u]. 

Rings of Invariants 

The final topic we will explore is the invariant theory of finite groups. 
In contrast to the previous discussions, our presentation will not be
self-contained. Instead, we will assume that the reader is familiar with the 
material presented in Chapter 7 of [CLO]. Our goal is to explain how graded 
resolutions can be used when working with polynomials invariant under a 
finite matrix group. 

For simplicity, we will work over the polynomial ring
$S = \mathbb{C}[x_1, \ldots, x_m]$. Suppose that
$G \subset \mathrm{GL}(m, \mathbb{C})$ is a finite group. If we regard
$g \in G$ as giving a change of coordinates on $\mathbb{C}^m$, then
substituting this coordinate change into
$f \in S = \mathbb{C}[x_1, \ldots, x_m]$ gives another polynomial
$g \cdot f \in S$. Then define 

$$S^G = \{ f \in \mathbb{C}[x_1, \ldots, x_m] : g \cdot f = f \text{ for all } g \in G \}.$$

Intuitively, $S^G$ consists of all polynomials $f \in S$ which are unchanged
(i.e., invariant) under all of the coordinate changes coming from elements
$g \in G$. The set $S^G$ has the following structure: 

• (Graded Subring) The set of invariants $S^G \subset S$ is a subring of
$S$, meaning that $S^G$ is closed under addition and multiplication by
elements of $S^G$. Also, if $f \in S^G$, then every homogeneous component of
$f$ also lies in $S^G$. 

(See Propositions 9 and 10 of Chapter 7, §2 of [CLO].) We say that $S^G$ is 
a graded subring of $S$. Hence the degree $t$ homogeneous part $S^G_t$
consists of all invariants which are homogeneous polynomials of degree $t$.
Note that $S^G$ is not an ideal of $S$. 

In this situation, we define the Molien series of $S^G$ to be the formal 
power series 

$$F_G(u) = \sum_{t=0}^{\infty} \dim_{\mathbb{C}}(S^G_t)\, u^t. \tag{4.26}$$

Molien series are important objects in the invariant theory of finite groups. 
We will see that they have a nice relation to Hilbert functions and graded 
resolutions. 

A basic result proved in Chapter 7, §3 of [CLO] is: 

• (Finite Generation of Invariants) For a finite group
$G \subset \mathrm{GL}(m, \mathbb{C})$, there are
$f_1, \ldots, f_s \in S^G$ such that every $f \in S^G$ is a polynomial in
$f_1, \ldots, f_s$. Furthermore, we can assume that $f_1, \ldots, f_s$ are
homogeneous. 

This enables us to regard $S^G$ as a module over a polynomial ring as
follows. Let $f_1, \ldots, f_s$ be homogeneous generators of the ring of
invariants $S^G$, and set $d_i = \deg f_i$. Then introduce variables
$y_1, \ldots, y_s$ and consider the ring
$R = \mathbb{C}[y_1, \ldots, y_s]$. The ring $R$ is useful because the map
sending $y_i$ to $f_i$ defines a ring homomorphism 

$$\varphi : R = \mathbb{C}[y_1, \ldots, y_s] \to S^G$$

which is onto since every invariant is a polynomial in $f_1, \ldots, f_s$.
An important observation is that $\varphi$ becomes a graded homomorphism of
degree zero provided we regard the variable $y_i$ as having degree
$d_i = \deg f_i$. Previously, the variables in a polynomial ring always had
degree 1, but here we will see that having $\deg y_i = d_i$ is useful. 

The kernel $I = \ker \varphi \subset R$ consists of all polynomial relations
among the $f_i$. Since $\varphi$ is onto, we get an isomorphism
$R/I \cong S^G$. Regarding $S^G$ as an $R$-module via
$y_i \cdot f = f_i f$ for $f \in S^G$, $R/I \cong S^G$ is an 
isomorphism of $R$-modules. Elements of $I$ are called syzygies among the 
invariants $f_1, \ldots, f_s$. (Historically, syzygies were first defined in
invariant theory, and only later was this term used in module theory, where
the meaning is slightly different). 

Before going any further, let’s pause for an example. Consider the group 
$G = \{e, g, g^2, g^3\} \subset \mathrm{GL}(2, \mathbb{C})$, where 

$$g = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{4.27}$$

The group $G$ acts on $f \in S = \mathbb{C}[x_1, x_2]$ via
$g \cdot f(x_1, x_2) = f(-x_2, x_1)$. Then, as shown in Example 4 of
Chapter 7, §3 of [CLO], the ring of invariants $S^G$ is generated by the
three polynomials 

$$f_1 = x_1^2 + x_2^2, \qquad f_2 = x_1^2x_2^2, \qquad f_3 = x_1^3x_2 - x_1x_2^3. \tag{4.28}$$

This gives $\varphi : R = \mathbb{C}[y_1, y_2, y_3] \to S^G$ where
$\varphi(y_i) = f_i$. Note that $y_1$ has degree 2 and $y_2, y_3$ both have
degree 4. One can also show that the kernel of $\varphi$ is
$I = \langle y_3^2 - y_1^2y_2 + 4y_2^2 \rangle$. This means that all
syzygies are generated by the single relation
$f_3^2 - f_1^2f_2 + 4f_2^2 = 0$ among the invariants (4.28). 
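
Both claims in this example can be spot-checked with integer arithmetic. The sketch below is an added illustration, not taken from [CLO]: it assumes the generator acts by $(x_1, x_2) \mapsto (-x_2, x_1)$ as described above, and it evaluates the three invariants and the degree 8 relation on a $9 \times 9$ grid of integer points. Since every polynomial involved has degree at most 8 in each variable, vanishing on such a grid forces the polynomial identity.

```python
def f1(x1, x2):
    return x1**2 + x2**2

def f2(x1, x2):
    return x1**2 * x2**2

def f3(x1, x2):
    return x1**3 * x2 - x1 * x2**3

def act(x1, x2):
    """Coordinate change for the generator g: (x1, x2) -> (-x2, x1)."""
    return -x2, x1

# A 9 x 9 grid of integer sample points.
points = [(x1, x2) for x1 in range(-4, 5) for x2 in range(-4, 5)]

# Invariance under g implies invariance under g^2 and g^3 as well.
invariant = all(
    f(*act(*pt)) == f(*pt) for f in (f1, f2, f3) for pt in points
)

# The relation f3^2 - f1^2 f2 + 4 f2^2 = 0 among the invariants.
relation = all(
    f3(*pt)**2 - f1(*pt)**2 * f2(*pt) + 4 * f2(*pt)**2 == 0 for pt in points
)

print(invariant, relation)
```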

Returning to our general discussion, the $R$-module structure on $S^G$ shows 
that the Molien series (4.26) is built from the Hilbert function of the
$R$-module $S^G$. This is immediate because 

$$\dim_{\mathbb{C}}(S^G_t) = H_{S^G}(t).$$

In Exercises 24 and 25, we will see more generally that any finitely 
generated $R$-module $M$ has a Hilbert series 

$$\sum_{t=-\infty}^{\infty} H_M(t)\, u^t.$$

The basic idea is that one can compute any Hilbert series using a graded 
resolution of $M$. In the case when all of the variables have degree 1, this
is explained in Exercise 24. 

However, we are in a situation where the variables have degree
$\deg y_i = d_i$ (sometimes called the weight of $y_i$). Formula (4.2) no
longer applies, so instead we use the key fact (to be proved in Exercise 25)
that the Hilbert series of the weighted polynomial ring
$R = \mathbb{C}[y_1, \ldots, y_s]$ is 

$$\sum_{t=0}^{\infty} H_R(t)\, u^t = \sum_{t=0}^{\infty} \dim_{\mathbb{C}}(R_t)\, u^t = \frac{1}{(1 - u^{d_1}) \cdots (1 - u^{d_s})}. \tag{4.29}$$

Furthermore, if we define the twisted free module $R(-d)$ in the usual way, 
then one easily obtains 

$$\sum_{t=0}^{\infty} H_{R(-d)}(t)\, u^t = \frac{u^d}{(1 - u^{d_1}) \cdots (1 - u^{d_s})} \tag{4.30}$$

(see Exercise 25 for the details). 

Let us see how this works in the example begun earlier. 

Exercise 12. Consider the group $G \subset \mathrm{GL}(2, \mathbb{C})$ given
in (4.27) with invariants (4.28) and syzygy
$f_3^2 - f_1^2f_2 + 4f_2^2 = 0$. 

a. Show that a minimal free resolution of $S^G$ as a graded $R$-module is
given by 

$$0 \to R(-8) \xrightarrow{\ \psi\ } R \xrightarrow{\ \varphi\ } S^G \to 0$$

where $\psi$ is the map represented by the $1 \times 1$ matrix
$(y_3^2 - y_1^2y_2 + 4y_2^2)$. 

b. Use part a together with (4.29) and (4.30) to show that the Molien series 
of $G$ is given by 

$$\begin{aligned}
F_G(u) &= \frac{1 - u^8}{(1 - u^2)(1 - u^4)^2} = \frac{1 + u^4}{(1 - u^2)(1 - u^4)} \\
&= 1 + u^2 + 3u^4 + 3u^6 + 5u^8 + 5u^{10} + \cdots
\end{aligned}$$

c. The coefficient 1 of $u^2$ tells us that we have a unique (up to
constant) invariant of degree 2, namely $f_1$. Furthermore, the coefficient
3 of $u^4$ tells us that besides the obvious degree 4 invariant $f_1^2$, we
must have two others, namely $f_2$ and $f_3$. Give similar explanations for
the coefficients of $u^6$ and $u^8$, and in particular explain how the
coefficient of $u^8$ proves that we must have a nontrivial syzygy of
degree 8. 
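
The power series expansion in part b can be generated mechanically. The following is a small sketch added for illustration (the function name `series_quotient` is hypothetical): it expands a rational function whose denominator is a product of factors $1 - u^d$ by using the fact that dividing a series by $1 - u^d$ amounts to the recurrence $c_k \leftarrow c_k + c_{k-d}$.

```python
def series_quotient(numerator, degrees, order):
    """Coefficients up to u^order of numerator(u) / prod_i (1 - u^d_i).

    numerator is a coefficient list (lowest degree first) and degrees
    lists the exponents d_i of the denominator factors.
    """
    coeffs = [0] * (order + 1)
    for k, ck in enumerate(numerator[: order + 1]):
        coeffs[k] = ck
    for d in degrees:
        # Dividing by (1 - u^d): each coefficient absorbs the one d steps back.
        for k in range(d, order + 1):
            coeffs[k] += coeffs[k - d]
    return coeffs

# (1 - u^8) / ((1 - u^2)(1 - u^4)^2), the form coming from the resolution.
from_resolution = series_quotient([1, 0, 0, 0, 0, 0, 0, 0, -1], [2, 4, 4], 10)

# (1 + u^4) / ((1 - u^2)(1 - u^4)), the simplified form.
simplified = series_quotient([1, 0, 0, 0, 1], [2, 4], 10)

# Hilbert series of the weighted ring R alone, as in (4.29).
weighted_ring = series_quotient([1], [2, 4, 4], 10)

print(from_resolution)
print(simplified)
print(weighted_ring[8])
```

Comparing the degree 8 coefficient of the weighted ring $R$ with that of the Molien series is one way to see the count behind part c.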



In general, one can show that if the invariant ring of a finite group $G$ is 
generated by homogeneous invariants $f_1, \ldots, f_s$ of degree
$d_1, \ldots, d_s$, then the Molien series of $G$ has the form 

$$F_G(u) = \frac{P(u)}{(1 - u^{d_1}) \cdots (1 - u^{d_s})}$$

for some polynomial $P(u)$. See Exercise 25 for the proof. As explained in 
[Sta2], $P(u)$ has the following intuitive meaning. If there are no
nontrivial syzygies between the $f_i$, then the Molien series would have
been 

$$\frac{1}{(1 - u^{d_1}) \cdots (1 - u^{d_s})}.$$

Had $S^G$ been generated by homogeneous elements $f_1, \ldots, f_s$ of
degrees $d_1, \ldots, d_s$, with homogeneous syzygies $s_1, \ldots, s_w$ of
degrees $\beta_1, \ldots, \beta_w$ and no second syzygies, then the Molien
series would be corrected to 

$$\frac{1 - \sum_j u^{\beta_j}}{\prod_i (1 - u^{d_i})}.$$

In general, by the Syzygy Theorem, we get 

$$F_G(u) = \frac{1 - \sum_j u^{\beta_j} + \sum_k u^{\gamma_k} - \cdots}{\prod_i (1 - u^{d_i})},$$

where the numerator contains at most $s$ sums. 
Our treatment of invariant theory has yet to mention some of the most 
important components of the theory. For example, we haven’t discussed 
Molien’s Theorem, which states that the Molien series (4.26) of a finite 
group $G \subset \mathrm{GL}(m, \mathbb{C})$ is given by the formula 

$$F_G(u) = \frac{1}{|G|} \sum_{g \in G} \frac{1}{\det(I - ug)}$$

where $|G|$ is the number of elements in $G$ and
$I \in \mathrm{GL}(m, \mathbb{C})$ is the identity matrix. This theorem is
why (4.26) is called a Molien series. The importance of Molien’s theorem is
that it allows one to compute the Molien series in advance. As shown by
part c of Exercise 12, the Molien series can predict the existence of
certain invariants and syzygies, which is useful from a 
computational point of view (see Section 2.2 of [Stu1]). 
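
Molien's formula is straightforward to evaluate for a small group. The sketch below is an added illustration under stated assumptions: it takes the generator of the cyclic group of order 4 to be the matrix with rows $(0, -1)$ and $(1, 0)$, which matches the action $g \cdot f(x_1, x_2) = f(-x_2, x_1)$ described earlier, and it uses the $2 \times 2$ identity $\det(I - ug) = 1 - \mathrm{tr}(g)\,u + \det(g)\,u^2$. All function names are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two 2x2 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det_one_minus_ug(g):
    """Coefficients [c0, c1, c2] of det(I - u g) = 1 - tr(g) u + det(g) u^2."""
    (p, q), (r, s) = g
    return [1, -(p + s), p * s - q * r]

def reciprocal_series(poly, order):
    """Power series of 1/poly up to u^order; requires poly[0] != 0."""
    inv = [Fraction(0)] * (order + 1)
    inv[0] = Fraction(1, poly[0])
    for k in range(1, order + 1):
        top = min(k, len(poly) - 1)
        acc = sum(poly[j] * inv[k - j] for j in range(1, top + 1))
        inv[k] = -acc / poly[0]
    return inv

# Assumed generator: rotation by a quarter turn.
g = ((0, -1), (1, 0))
group = [((1, 0), (0, 1))]
for _ in range(3):
    group.append(matmul(group[-1], g))   # e, g, g^2, g^3

order = 10
terms = [reciprocal_series(det_one_minus_ug(h), order) for h in group]
molien = [sum(column) / len(group) for column in zip(*terms)]

print([int(coefficient) for coefficient in molien])
```

The coefficients can then be compared with the expansion in part b of Exercise 12, without knowing any invariants beforehand.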

A second crucial aspect we’ve omitted is that the ring of invariants $S^G$
is Cohen-Macaulay. This has some far-reaching consequences for the
invariant theory. For example, being Cohen-Macaulay predicts that there are 
algebraically independent invariants $\theta_1, \ldots, \theta_r$ such that
the invariant ring $S^G$ is a free module over the polynomial ring
$\mathbb{C}[\theta_1, \ldots, \theta_r]$. For example, in the invariant ring
$S^G = \mathbb{C}[f_1, f_2, f_3]$ considered in Exercise 12, one can 
show that as a module over $\mathbb{C}[f_1, f_2]$, 

$$S^G = \mathbb{C}[f_1, f_2] \oplus f_3\,\mathbb{C}[f_1, f_2].$$

(Do you see how the syzygy $f_3^2 - f_1^2f_2 + 4f_2^2 = 0$ enables us to get
rid of terms involving $f_3^2$, $f_3^3$, etc?) This has some strong
implications for the Molien series, as explained in [Sta2] or [Stu1]. 

Hence, to really understand the invariant theory of finite groups, one 
needs to combine the free resolutions discussed here with a variety of 
other tools, some of which are more sophisticated (such as Cohen-Macaulay 
rings). Fortunately, some excellent expositions are available in the
literature, and we especially recommend [Sta2] and [Stu1]. Additional
references are mentioned at the end of Chapter 7, §3 of [CLO]. 

This brings us to the end of our discussion of resolutions. The examples
presented in this section (ideals of three points, $\mu$-bases, and Molien 
series) are merely the beginning of a wonderful collection of topics related 
to the geometry of free resolutions. When combined with the elegance of 
the algebra involved, it becomes clear why the study of free resolutions is 
one of the richer areas of contemporary algebraic geometry. 

To learn more about free resolutions, we suggest the references [Eis], 
[Schre2] and [EH] mentioned earlier in the section. The reader may also 
wish to consult [BH, Chapter 4] for a careful study of Hilbert functions. 



Additional Exercises for §4 

Exercise 13. The Hilbert polynomial has the property that
$H_M(t) = HP_M(t)$ for all $t$ sufficiently large. In this exercise, you
will derive an explicit bound on how large $t$ has to be in terms of a
graded resolution of $M$. 

a. Equation (4.3) shows that the binomial coefficient $\binom{t + n}{n}$ is
given by a polynomial of degree $n$ in $t$. Show that this identity holds
for all $t \ge -n$ and also explain why it fails to hold when
$t = -n - 1$. 

b. For a twisted free module $M = R(-d_1) \oplus \cdots \oplus R(-d_m)$,
show that $H_M(t) = HP_M(t)$ holds for $t \ge \max_i(d_i - n)$. 

c. Now suppose we have a graded resolution $\cdots \to F_0 \to M$ where 
$F_j = \oplus_i R(-d_{ij})$. Then show that $H_M(t) = HP_M(t)$ holds for all 
$t \ge \max_{ij}(d_{ij} - n)$. 

d. For the ideal $I \subset k[x, y, z, w]$ from (4.5), we found the graded
resolution 

$$0 \to R(-3)^2 \to R(-2)^3 \to R \to R/I \to 0.$$

Use this and part c to show that $H_{R/I}(t) = HP_{R/I}(t)$ for all
$t \ge 0$. How does this relate to (4.6)? 

Exercise 14. Given a parametrization as in (4.11), we get the ideal
$I = \langle c(t)x - a(t),\ c(t)y - b(t) \rangle \subset k[x, y, t]$. We
will assume $\mathrm{GCD}(a, b, c) = 1$. 

a. Show that $\mathbf{V}(I) \subset k^3$ is the graph of the function
$F : k - W \to k^2$ defined by $F(t) = (a(t)/c(t),\ b(t)/c(t))$, where
$W = \{t \in k : c(t) = 0\}$. 

b. If $I_1 = I \cap k[x, y]$, prove that $\mathbf{V}(I_1) \subset k^2$ is
the smallest variety containing the parametrization (4.11). Hint: This
follows by adapting the proof of Theorem 1 of Chapter 3, §3 of [CLO]. 

Exercise 15. This exercise concerns the Koszul complex used in the proof 
of Lemma (4.14). 

a. Assuming $\mathrm{GCD}(a, b, c) = 1$ in $S = k[t]$, prove that the
sequence (4.15) is exact at its middle term. Hint: Our hypothesis implies
that there are polynomials $p, q, r \in k[t]$ such that
$pa + qb + rc = 1$. Then if $(A, B, C) \in \ker(\beta)$, note that 

$$\begin{aligned}
A &= paA + qbA + rcA \\
&= p(-bB - cC) + qbA + rcA \\
&= c(-pC + rA) + b(-pB + qA).
\end{aligned}$$

b. Using Exercise 10 of §2 as a model, show how to extend (4.15) to the 
full Koszul complex 

$$0 \to S \to S^3 \xrightarrow{\ \alpha\ } S^3 \xrightarrow{\ \beta\ } S \to 0$$

of $a, b, c$. Also, when $\mathrm{GCD}(a, b, c) = 1$, prove that the entire
sequence is exact. 

c. More generally, show that $a_1, \ldots, a_m \in S$ give a Koszul complex
and prove that it is exact when $\mathrm{GCD}(a_1, \ldots, a_m) = 1$. (This
is a challenging exercise.) 

Exercise 16. In the proof of Theorem (4.17), we noted that the columns 
of the matrix (4.21) generate the syzygy module
$\mathrm{Syz}(\tilde{a}, \tilde{b}, \tilde{c})$. If we define $p, q$ using
(4.23), then prove that $p, q$ generate $I(1)$. 

Exercise 17. In this exercise, you will study the $m$-dimensional version 
of Theorem (4.17). Thus we assume that we have a parametrization (4.24) 
of a curve in $k^m$ such that $c \ne 0$ and
$\mathrm{GCD}(a_1, \ldots, a_m) = 1$. Also let 

$$I = \langle cx_1 - a_1, \ldots, cx_m - a_m \rangle \subset k[x_1, \ldots, x_m, t]$$

and define 

$$I(1) = \{ f \in I : f = A_1(t)x_1 + \cdots + A_m(t)x_m + C(t) \}.$$

a. Prove the analog of Lemma (4.14), i.e., show that there is a natural 
isomorphism $I(1) \cong \mathrm{Syz}(a_1, \ldots, a_m, c)$. Hint: You will
use part c of Exercise 15. 

b. If $n = \max(\deg a_1, \ldots, \deg a_m, \deg c)$ and
$\tilde{a}_i, \tilde{c} \in R = k[t, u]$ are the degree $n$ homogenizations
of $a_i, c$, then explain why there is an injective map 

$$\beta : R(-d_1) \oplus \cdots \oplus R(-d_s) \to R(-n)^{m+1}$$

whose image is $\mathrm{Syz}(\tilde{a}_1, \ldots, \tilde{a}_m, \tilde{c})$. 

c. Use Hilbert polynomials to show that $s = m$ and that
$d_1 + \cdots + d_m = (m + 1)n$. 

d. If $L$ is the matrix representing $\beta$, show that the $j$th column of
$L$ consists of homogeneous polynomials of degree $\mu_j = d_j - n$. Also
explain why $\mu_1 + \cdots + \mu_m = n$. 

e. Finally, by dehomogenizing the entries of the $j$th column of $L$, show
that we get the polynomial $p_j$ as in (4.25), and prove that
$I = \langle p_1, \ldots, p_m \rangle$. 

f. Use the Hilbert-Burch Theorem to show that if $p_1$ is modified by a 
suitable constant, then up to a constant, $a_1, \ldots, a_m, c$ are the
$m \times m$ minors of the matrix $(p_{ij}(t, 1))$ coming from (4.25). 

Exercise 18. Compute the Hilbert function and Hilbert polynomial of the 
ideal of the rational quartic curve in $\mathbb{P}^3$ whose graded
resolution is given in (3.6). What does the Hilbert polynomial tell you
about the dimension and the degree? 

Exercise 19. In $k[x_0, \ldots, x_n]$, $n \ge 2$, consider the homogeneous
ideal $I_n$ defined by the determinants of the $\binom{n}{2}$ $2 \times 2$
submatrices of the $2 \times n$ matrix 

$$M = \begin{pmatrix} x_0 & x_1 & \cdots & x_{n-1} \\ x_1 & x_2 & \cdots & x_n \end{pmatrix}.$$

(We studied this ideal in Exercise 15 of §3.) Compute the Hilbert functions 
and Hilbert polynomials of $I_4$ and $I_5$. Also determine the degrees of
the curves $\mathbf{V}(I_4)$ and $\mathbf{V}(I_5)$ and verify that they have
dimension 1. Hint: In part b of Exercise 15 of §3, you computed graded
resolutions of these two ideals. 

Exercise 20. In this exercise, we will show how the construction of the 
rational normal curves from the previous exercise and Exercise 15 of §3 
relates to the moving lines considered in this section. 

a. Show that for each $(t, u) \in \mathbb{P}^1$, the intersection of the
lines $\mathbf{V}(tx_0 + ux_1)$ and $\mathbf{V}(tx_1 + ux_2)$ lies on the
conic section $\mathbf{V}(x_0x_2 - x_1^2)$ in $\mathbb{P}^2$. Express 
the equation of the conic as a $2 \times 2$ determinant. 

b. Generalizing part a, show that for all $n \ge 2$, if we construct $n$
moving hyperplanes $H_i(t, u) = \mathbf{V}(tx_{i-1} + ux_i)$ for
$i = 1, \ldots, n$, then for each $(t, u)$ in $\mathbb{P}^1$, the
intersection $H_1(t, u) \cap \cdots \cap H_n(t, u)$ is a point on the 
standard rational normal curve in $\mathbb{P}^n$ given as in Exercise 15 of
§3, and show how the determinantal equations follow from this observation. 

Exercise 21. In $k[x_0, \ldots, x_n]$, $n \ge 3$, consider the homogeneous ideal $J_n$
defined by the determinants of the $\binom{n-1}{2}$ $2 \times 2$ submatrices of the $2 \times (n - 1)$
matrix

$$N = \begin{pmatrix} x_0 & x_2 & \cdots & x_{n-1} \\ x_1 & x_3 & \cdots & x_n \end{pmatrix}.$$

The varieties $\mathbf{V}(J_n)$ are surfaces called rational normal scrolls in $\mathbb{P}^n$. For
instance, $J_3 = \langle x_0x_3 - x_1x_2 \rangle$ is the ideal of a smooth quadric surface in
$\mathbb{P}^3$.

a. Find a graded resolution of J4 and compute its Hilbert function and 
Hilbert polynomial. Check that the dimension is 2 and compute the 
degree of the surface. 

b. Do the same for J5. 

Exercise 22. The (degree 2) Veronese surface $V \subset \mathbb{P}^5$ is the image of the
mapping given in homogeneous coordinates by

$$\varphi : \mathbb{P}^2 \to \mathbb{P}^5, \qquad (x_0, x_1, x_2) \mapsto (x_0^2, x_1^2, x_2^2, x_0x_1, x_0x_2, x_1x_2).$$

a. Compute the homogeneous ideal $I = \mathbf{I}(V) \subset k[x_0, \ldots, x_5]$.

b. Find a graded resolution of $I$ and compute its Hilbert function and
Hilbert polynomial. Also check that the dimension is equal to 2 and the
degree is equal to 4.

Exercise 23. Let $p_1 = (0, 0, 1)$, $p_2 = (1, 0, 1)$, $p_3 = (0, 1, 1)$, $p_4 = (1, 1, 1)$
in $\mathbb{P}^2$, and let $I = \mathbf{I}(\{p_1, p_2, p_3, p_4\})$ be the homogeneous ideal of the variety
$\{p_1, p_2, p_3, p_4\}$ in $R = k[x_0, x_1, x_2]$.

a. Show that $I_3$ (the degree 3 graded piece of $I$) has dimension exactly 6.

b. Let $f_0, \ldots, f_5$ be any vector space basis for $I_3$, and consider the rational
mapping $\varphi : \mathbb{P}^2 \dashrightarrow \mathbb{P}^5$ given in homogeneous coordinates by

$$\varphi(x_0, x_1, x_2) = (y_0, \ldots, y_5) = (f_0(x_0, x_1, x_2), \ldots, f_5(x_0, x_1, x_2)).$$

Find the homogeneous ideal $J$ of the image variety of $\varphi$.

c. Show that $J$ has a graded resolution as an $S = k[y_0, \ldots, y_5]$-module of
the form

$$0 \to S(-5) \to S(-3)^5 \xrightarrow{A} S(-2)^5 \to J \to 0.$$

d. Use the resolution above to compute the Hilbert function of $J$.

The variety $V = \mathbf{V}(J) = \varphi(\mathbb{P}^2)$ is called a quintic del Pezzo surface, and
the resolution given in part c has some other interesting properties. For
instance, if the ideal basis for $J$ is ordered in the right way and signs are
adjusted appropriately, then $A$ is skew-symmetric, and the determinants of
the $4 \times 4$ submatrices obtained by deleting row $i$ and column $i$ ($i = 1, \ldots, 5$)
are the squares of the generators of $J$. This is a reflection of a remarkable
structure on the resolutions of Gorenstein codimension 3 ideals proved by
Buchsbaum and Eisenbud. See [BE].



Exercise 24. One convenient way to “package” the Hilbert function $H_M$
for a graded module $M$ is to consider its generating function, the formal
power series

$$H(M, u) = \sum_{t=-\infty}^{\infty} H_M(t)\, u^t.$$

We will call $H(M, u)$ the Hilbert series for $M$.

a. Show that for $M = R = k[x_0, \ldots, x_n]$, we have

$$H(R, u) = \sum_{t=0}^{\infty} \binom{n+t}{n} u^t = \frac{1}{(1-u)^{n+1}},$$

where the second equality comes from the formal geometric series
identity $1/(1 - u) = \sum_{t=0}^{\infty} u^t$ and induction on $n$.

b. Show that if $R = k[x_0, \ldots, x_n]$ and

$$M = R(-d_1) \oplus \cdots \oplus R(-d_m)$$

is one of the twisted graded free modules over $R$, then

$$H(M, u) = \frac{u^{d_1} + \cdots + u^{d_m}}{(1-u)^{n+1}}.$$

c. Let $I$ be the ideal of the twisted cubic in $\mathbb{P}^3$ studied in Exercise 2 of §2,
and let $R = k[x, y, z, w]$. Find the Hilbert series $H(R/I, u)$.

d. Using part b and Theorem (4.4) deduce that the Hilbert series of any
graded $k[x_0, \ldots, x_n]$-module $M$ can be written in the form

$$H(M, u) = \frac{P(u)}{(1-u)^{n+1}},$$

where $P$ is a polynomial in $u$ with coefficients in $\mathbb{Z}$.

Exercise 25. Consider the polynomial ring $R = k[y_1, \ldots, y_s]$, where $y_i$
has weight or degree $\deg y_i = d_i > 0$. Then a monomial $y_1^{a_1} \cdots y_s^{a_s}$ has
(weighted) degree $t = d_1a_1 + \cdots + d_sa_s$. This gives a grading on $R$ such that
$R_t$ is the set of $k$-linear combinations of monomials of degree $t$.

a. Prove that the Hilbert series of $R$ is given by

$$\sum_{t=0}^{\infty} \dim_k(R_t)\, u^t = \frac{1}{(1-u^{d_1}) \cdots (1-u^{d_s})}.$$

Hint: $1/(1 - u^{d_i}) = \sum_{a_i=0}^{\infty} u^{d_ia_i}$. When these series are multiplied
together for $i = 1, \ldots, s$, do you see how each monomial of weighted
degree $t$ contributes to the coefficient of $u^t$?

b. Explain how part a relates to part a of Exercise 24.

c. If $R(-d)$ is defined by $R(-d)_t = R_{t-d}$, then prove (4.30).

d. Generalize parts b, c and d of Exercise 24 to $R = k[y_1, \ldots, y_s]$.

Exercise 26. Suppose that $a, b, c \in k[t]$ have maximum degree 6. As
usual, we will assume $c \ne 0$ and $\mathrm{GCD}(a, b, c) = 1$.

a. If a = b — and c = — t — show 

that $\mu = 2$ and find a $\mu$-basis.

b. Find an example where $\mu = 3$ and compute a $\mu$-basis for your example.
Hint: This is the generic case.

Exercise 27. Compute the Molien series for the following finite matrix
groups in $\mathrm{GL}(2, \mathbb{C})$. In each case, the ring of invariants $\mathbb{C}[x_1, x_2]^G$ can be
computed by the methods of Chapter 7, §3 of [CLO].

a. The Klein four-group generated by ^ q ^ and 

b. The two-element group generated by y = 

c. The four-element group generated by y = 






Chapter 7 

Polytopes, Resultants, and 
Equations 



In this chapter we will examine some interesting recently-discovered con- 
nections between polynomials, resultants, and the geometry of the convex 
polytopes determined by the exponent vectors of the monomials appearing 
in polynomials. 



§1 Geometry of Polytopes

A set $C$ in $\mathbb{R}^n$ is said to be convex if it contains the line segment connecting
any two points in $C$. If a set is not itself convex, its convex hull is the
smallest convex set containing it. We will use the notation $\mathrm{Conv}(S)$ to
denote the convex hull of $S \subset \mathbb{R}^n$.

More explicitly, all the points in $\mathrm{Conv}(S)$ may be obtained by forming
a particular set of linear combinations of the elements in $S$. In Exercise 1
below, you will prove the following proposition.

(1.1) Proposition. Let $S$ be a subset of $\mathbb{R}^n$. Then

$$\mathrm{Conv}(S) = \Big\{\lambda_1 s_1 + \cdots + \lambda_m s_m : s_i \in S,\ \lambda_i \ge 0,\ \sum_{i=1}^{m} \lambda_i = 1\Big\}.$$

Linear combinations of the form $\lambda_1 s_1 + \cdots + \lambda_m s_m$, where $s_i \in S$,
$\lambda_i \ge 0$, and $\sum_{i=1}^{m} \lambda_i = 1$ are called convex combinations.

Exercise 1. 

a. Show that if $S = \{s_1, s_2\}$ then the set of convex combinations is the
straight line segment from $s_1$ to $s_2$ in $\mathbb{R}^n$. Deduce that Proposition (1.1)
holds in this case.

b. Using part a, show that the set of all convex combinations

$$\Big\{\lambda_1 s_1 + \cdots + \lambda_m s_m : s_i \in S,\ \lambda_i \ge 0,\ \sum_{i=1}^{m} \lambda_i = 1\Big\}$$

is a convex subset of $\mathbb{R}^n$ for every $S$. Also show that this set contains $S$.






c. Show that if $C$ is any convex set containing $S$, then $C$ also contains the
set of part b. Hint: One way is to use induction on the number of terms
in the sum.

d. Deduce Proposition (1.1) from parts b and c. 

By definition, a polytope is the convex hull of a finite set in $\mathbb{R}^n$. If the
finite set is $\mathcal{A} = \{m_1, \ldots, m_l\} \subset \mathbb{R}^n$, then the corresponding polytope can
be expressed as

$$\mathrm{Conv}(\mathcal{A}) = \Big\{\lambda_1 m_1 + \cdots + \lambda_l m_l : \lambda_i \ge 0,\ \sum_{i=1}^{l} \lambda_i = 1\Big\}.$$

In low dimensions, polytopes are familiar figures from geometry:

• A polytope in $\mathbb{R}$ is a line segment.

• A polytope in $\mathbb{R}^2$ is a line segment or a convex polygon.

• A polytope in $\mathbb{R}^3$ is a line segment, a convex polygon lying in a plane,
or a three-dimensional polyhedron.

As these examples suggest, every polytope $Q$ has a well-defined dimension.
A careful definition of $\dim Q$ will be given in the exercises at the end of
the section. For more background on convex sets and polytopes, the reader
should consult [Zie]. Fig. 7.1 below shows a three-dimensional polytope.

For another example, let $\mathcal{A} = \{(0, 0), (2, 0), (0, 5), (1, 1)\} \subset \mathbb{R}^2$. Here,
$\mathrm{Conv}(\mathcal{A})$ is the triangle with vertices $(0, 0)$, $(2, 0)$, and $(0, 5)$ since

(1.2)    $(1, 1) = \tfrac{3}{10}(0, 0) + \tfrac{1}{2}(2, 0) + \tfrac{1}{5}(0, 5)$

is a convex combination of the other three points in $\mathcal{A}$.
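
The arithmetic behind this convex combination can be checked exactly with rational numbers. The following is only a minimal sketch; the helper name `convex_combination` is an illustrative assumption and not notation from the text.

```python
from fractions import Fraction

def convex_combination(points, weights):
    """Return sum(w * p) after checking that the weights are >= 0 and sum to 1."""
    if any(w < 0 for w in weights) or sum(weights) != 1:
        raise ValueError("weights must be nonnegative and sum to 1")
    dim = len(points[0])
    return tuple(sum(w * p[k] for w, p in zip(weights, points)) for k in range(dim))

# The three vertices of the triangle and the weights of the combination.
vertices = [(0, 0), (2, 0), (0, 5)]
weights = [Fraction(3, 10), Fraction(1, 2), Fraction(1, 5)]

combo = convex_combination(vertices, weights)
print(combo)  # both coordinates equal 1
```

Using `Fraction` avoids floating-point rounding, so the comparison with the point $(1, 1)$ is exact.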

For us, the most important polytopes will be convex hulls of sets of points
with integer coordinates. These are sometimes called lattice polytopes. Thus
a lattice polytope is a set of the form $\mathrm{Conv}(\mathcal{A})$, where $\mathcal{A} \subset \mathbb{Z}^n$ is finite. An
example of special interest to us is when $\mathcal{A}$ consists of all exponent vectors
appearing in a collection of monomials. The polytope $Q = \mathrm{Conv}(\mathcal{A})$ will
play a very important role in this chapter.

Figure 7.1. A Three-Dimensional Polytope

Exercise 2. Let $\mathcal{A}_d = \{m \in \mathbb{Z}^n_{\ge 0} : |m| \le d\}$ be the set of exponent
vectors of all monomials of total degree at most $d$.

a. Show that the convex hull of $\mathcal{A}_d$ is the polytope

$$Q_d = \Big\{(a_1, \ldots, a_n) \in \mathbb{R}^n : a_i \ge 0,\ \sum_{i=1}^{n} a_i \le d\Big\}.$$

Draw a picture of $\mathcal{A}_d$ and $Q_d$ when $n = 1, 2, 3$ and $d = 1, 2, 3$.

b. A simplex is defined to be the convex hull of $n + 1$ points $m_1, \ldots, m_{n+1}$
such that $m_2 - m_1, \ldots, m_{n+1} - m_1$ are a basis of $\mathbb{R}^n$. Show that the
polytope $Q_d$ of part a is a simplex.

A polytope $Q \subset \mathbb{R}^n$ has an $n$-dimensional volume, which is denoted
$\mathrm{Vol}_n(Q)$. For example, a polygon $Q$ in $\mathbb{R}^2$ has $\mathrm{Vol}_2(Q) > 0$, but if we
regard $Q$ as lying in the $xy$-plane in $\mathbb{R}^3$, then $\mathrm{Vol}_3(Q) = 0$.

From multivariable calculus, we have

$$\mathrm{Vol}_n(Q) = \int \cdots \int_Q 1 \, dx_1 \cdots dx_n,$$

where $x_1, \ldots, x_n$ are coordinates on $\mathbb{R}^n$. Note that $Q$ has positive volume
if and only if it is $n$-dimensional. A simple example is the unit cube in $\mathbb{R}^n$,
which is defined by $0 \le x_i \le 1$ for all $i$ and clearly has volume 1.

Exercise 3. Let’s compute the volume of the simplex $Q_d$ from Exercise 2.

a. Prove that the map $\phi : \mathbb{R}^n \to \mathbb{R}^n$ defined by

$$\phi(x_1, \ldots, x_n) = (1 - x_1,\ x_1(1 - x_2),\ x_1x_2(1 - x_3),\ \ldots,\ x_1 \cdots x_{n-1}(1 - x_n))$$

maps the unit cube $C \subset \mathbb{R}^n$ defined by $0 \le x_i \le 1$ to the simplex $Q_1$.
Hint: Use a telescoping sum to show $\phi(C) \subset Q_1$. Be sure to prove the
opposite inclusion.

b. Use part a and the change of variables formula for $n$-dimensional
integrals to show that

$$\mathrm{Vol}_n(Q_1) = \int \cdots \int_C x_1^{n-1} x_2^{n-2} \cdots x_{n-1} \, dx_1 \cdots dx_n = \frac{1}{n!}.$$

c. Conclude that $\mathrm{Vol}_n(Q_d) = d^n/n!$.
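
The value $d^n/n!$ can be cross-checked for small $n$ and $d$ without the change of variables in the exercise. The sketch below instead relies on the standard determinant formula for the volume of a simplex, $|\det(m_2 - m_1, \ldots, m_{n+1} - m_1)|/n!$, which is an assumption brought in from outside this section; the function names are illustrative only.

```python
from fractions import Fraction
from math import factorial

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def simplex_volume(points):
    """Volume of the convex hull of n+1 points in R^n via |det| / n!."""
    base = points[0]
    edges = [[p[k] - base[k] for k in range(len(base))] for p in points[1:]]
    return abs(det(edges)) / factorial(len(base))

def q_d_vertices(n, d):
    """Vertices of Q_d: the origin and d times each standard basis vector."""
    origin = tuple(0 for _ in range(n))
    return [origin] + [tuple(d if k == j else 0 for k in range(n)) for j in range(n)]

for n in (1, 2, 3):
    for d in (1, 2, 3):
        print(n, d, simplex_volume(q_d_vertices(n, d)), Fraction(d**n, factorial(n)))
```

Each printed row shows the computed volume next to $d^n/n!$; agreement on these cases is a check, not a substitute for the proof asked for in the exercise.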

A polytope has special subsets called its faces. For example, a
3-dimensional polytope in $\mathbb{R}^3$ has:

• faces, which are polygons lying in planes, 

• edges, which are line segments connecting certain pairs of vertices, and 

• vertices, which are points. 

In the general theory, all of these will be called faces. To define a face
of an arbitrary polytope $Q \subset \mathbb{R}^n$, let $\nu$ be a nonzero vector in $\mathbb{R}^n$. An
affine hyperplane is defined by an equation of the form $m \cdot \nu = -a$ (the
minus sign simplifies certain formulas in §3 and §4; see Exercise 3 of §3
and Proposition (4.6)). If

(1.3)    $$a_Q(\nu) = -\min_{m \in Q}(m \cdot \nu),$$

then we call the equation

$$m \cdot \nu = -a_Q(\nu)$$

a supporting hyperplane of $Q$, and we call $\nu$ an inward pointing normal.
Fig. 7.2 below shows a polytope $Q \subset \mathbb{R}^2$ with two supporting hyperplanes
(lines in this case) and their inward pointing normals.

In Exercise 14 at the end of the section, you will show that a supporting
hyperplane has the property that

$$Q_\nu = Q \cap \{m \in \mathbb{R}^n : m \cdot \nu = -a_Q(\nu)\} \ne \emptyset,$$

and, furthermore, $Q$ lies in the half-space

$$Q \subset \{m \in \mathbb{R}^n : m \cdot \nu \ge -a_Q(\nu)\}.$$

We call $Q_\nu = Q \cap \{m \in \mathbb{R}^n : m \cdot \nu = -a_Q(\nu)\}$ the face of $Q$ determined
by $\nu$. Fig. 7.2 illustrates two faces, one a vertex and the other an edge.

Exercise 4. Draw a picture of a cube in $\mathbb{R}^3$ with three supporting
hyperplanes which define faces of dimensions 0, 1, and 2 respectively. Be sure to
include the inward pointing normals in each case.

Every face of $Q$ is a polytope of dimension less than $\dim Q$. Vertices are
faces of dimension 0 (i.e., points) and edges are faces of dimension 1. If
$Q$ has dimension $n$, then facets are faces of dimension $n - 1$. Assuming
$Q \subset \mathbb{R}^n$, a facet lies on a unique supporting hyperplane and hence has
a unique inward pointing normal (up to a positive multiple). In contrast,
faces of lower dimension lie in infinitely many supporting hyperplanes. For
example, the vertex at the origin in Fig. 7.2 is cut out by infinitely many
lines through the origin.

Figure 7.2. Supporting Hyperplanes, Inward Normals, and Faces

We can characterize an $n$-dimensional polytope $Q \subset \mathbb{R}^n$ in terms of
its facets as follows. If $\mathcal{F} \subset Q$ is a facet, we just noted that the inward
normal is determined up to a positive constant. Suppose that $Q$ has facets
$\mathcal{F}_1, \ldots, \mathcal{F}_N$ with inward pointing normals $\nu_1, \ldots, \nu_N$ respectively. Each
facet $\mathcal{F}_j$ has a supporting hyperplane defined by an equation $m \cdot \nu_j = -a_j$
for some $a_j$. Then one can show that the polytope $Q$ is given by

(1.4)    $$Q = \{m \in \mathbb{R}^n : m \cdot \nu_j \ge -a_j \text{ for all } j = 1, \ldots, N\}.$$

In the notation of (1.3), note that $a_j = a_Q(\nu_j)$.
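
To see (1.3) and (1.4) working together, the sketch below computes $a_Q(\nu)$ for the unit square from its four corner points and then tests membership by the facet inequalities. It assumes the standard fact that a linear function on a convex hull attains its minimum at one of the generating points; the names `a_Q`, `in_polytope` and the chosen normals are illustrative, not taken from the text.

```python
def dot(m, nu):
    """Ordinary dot product of two vectors given as tuples."""
    return sum(mi * ni for mi, ni in zip(m, nu))

def a_Q(points, nu):
    """a_Q(nu) = -min(m . nu), with the minimum taken over the generating points."""
    return -min(dot(m, nu) for m in points)

def in_polytope(m, facets):
    """True if m . nu_j >= -a_j for every (nu_j, a_j), as in (1.4)."""
    return all(dot(m, nu) >= -a for nu, a in facets)

# Unit square Conv({(0,0), (1,0), (0,1), (1,1)}) with one inward normal per edge.
square_points = [(0, 0), (1, 0), (0, 1), (1, 1)]
inward_normals = [(1, 0), (-1, 0), (0, 1), (0, -1)]
facets = [(nu, a_Q(square_points, nu)) for nu in inward_normals]

print(facets)
print(in_polytope((1, 1), facets), in_polytope((2, 0), facets))
```

The facet data produced here are the pairs $(\nu_j, a_j)$ of (1.4); a point is accepted only if it satisfies all four inequalities.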

Exercise 5. How does (1.4) change if we use an outward normal for each 
facet? 

When $Q$ is a lattice polytope, we can rescale the inward normal $\nu_{\mathcal{F}}$ of a
facet $\mathcal{F}$ so that $\nu_{\mathcal{F}}$ has integer coordinates. We can also assume that the
coordinates are relatively prime. In this case, we say that $\nu_{\mathcal{F}}$ is primitive.
It follows that $\mathcal{F}$ has a unique primitive inward pointing normal $\nu_{\mathcal{F}} \in \mathbb{Z}^n$.
For lattice polytopes, we will always assume that the inward normals have
this property.

Exercise 6. For the lattice polygon $Q$ of Fig. 7.2, find the inward pointing
normals. Also, if $e_1, e_2$ are the standard basis vectors for $\mathbb{R}^2$, then show that
the representation (1.4) of $Q$ is given by the inequalities

$$m \cdot e_1 \ge 0, \quad m \cdot e_2 \ge 0, \quad m \cdot (-e_2) \ge -1, \quad m \cdot (-e_1 - e_2) \ge -2.$$

Exercise 7. Let $e_1, \ldots, e_n$ be the standard basis of $\mathbb{R}^n$.

a. Show that the simplex $Q_d \subset \mathbb{R}^n$ of Exercise 2 is given by the inequalities

$$m \cdot \nu_0 \ge -d, \quad \text{and} \quad m \cdot \nu_j \ge 0,\ j = 1, \ldots, n,$$

where $\nu_0 = -e_1 - \cdots - e_n$ and $\nu_j = e_j$ for $j = 1, \ldots, n$.

b. Show that the square $Q = \mathrm{Conv}(\{(0, 0), (1, 0), (0, 1), (1, 1)\}) \subset \mathbb{R}^2$ is
given by the inequalities

$$m \cdot \nu_1 \ge 0, \quad m \cdot \nu_2 \ge -1, \quad m \cdot \nu_3 \ge 0, \quad \text{and} \quad m \cdot \nu_4 \ge -1,$$

where $e_1 = \nu_1 = -\nu_2$ and $e_2 = \nu_3 = -\nu_4$. A picture of this appears in
Fig. 7.3 on the next page (with shortened inward normals for legibility).

One of the themes of this chapter is that there is a very deep connection
between lattice polytopes and polynomials. To describe the connection, we
will use the following notation. Let $f \in \mathbb{C}[x_1, \ldots, x_n]$ (or, more generally,
in $k[x_1, \ldots, x_n]$ for any field of coefficients), and write

$$f = \sum_{\alpha \in \mathbb{Z}^n_{\ge 0}} c_\alpha x^\alpha.$$

The Newton polytope of $f$, denoted $\mathrm{NP}(f)$, is the lattice polytope

$$\mathrm{NP}(f) = \mathrm{Conv}(\{\alpha \in \mathbb{Z}^n_{\ge 0} : c_\alpha \ne 0\}).$$

In other words, the Newton polytope records the “shape” or “sparsity
structure” of a polynomial: it tells us which monomials appear with nonzero
coefficients. The actual values of the coefficients do not matter, however,
in the definition of $\mathrm{NP}(f)$.

For example, any polynomial of the form

$$f = axy + bx^2 + cy^5 + d$$

with $a, b, c, d \ne 0$ has Newton polytope equal to the triangle

$$Q = \mathrm{Conv}(\{(1, 1), (2, 0), (0, 5), (0, 0)\}).$$

In fact, (1.2) shows that polynomials of this form with $a = 0$ would have
the same Newton polytope.
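
For polynomials in two variables, the vertices of the Newton polytope can be found with any planar convex hull routine. The sketch below uses Andrew's monotone chain algorithm, which is a choice made here for illustration rather than a method from the text, and stores a polynomial as a dictionary from exponent pairs to coefficients (also an assumption).

```python
def cross(o, a, b):
    """Cross product of the vectors o->a and o->b; positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_vertices(points):
    """Vertices of the convex hull of planar points (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that do not make a strict left turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def newton_polytope(poly):
    """Vertices of NP(f) for f given as {(i, j): coefficient of x^i y^j}."""
    return hull_vertices([e for e, c in poly.items() if c != 0])

# f = a*x*y + b*x^2 + c*y^5 + d with sample nonzero coefficients.
f = {(1, 1): 3, (2, 0): 1, (0, 5): -2, (0, 0): 7}
f_without_xy = {(1, 1): 0, (2, 0): 1, (0, 5): -2, (0, 0): 7}

print(newton_polytope(f))
print(newton_polytope(f_without_xy))
```

Both calls return the same three vertices, reflecting the remark that the exponent $(1, 1)$ lies inside the triangle and so does not affect the Newton polytope.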

Exercise 8. What is the Newton polytope of a 1-variable polynomial
$f = \sum_{i=0}^{m} c_i x^i$, assuming that $c_m \ne 0$, so that the degree of $f$ is exactly $m$?
Are there special cases depending on the other coefficients?

Exercise 9. Write down a polynomial whose Newton polytope equals the
polytope $Q_d$ from Exercise 2. Which coefficients must be non-zero to obtain
$\mathrm{NP}(f) = Q_d$? Which can be zero?

We can also go the other way, from exponents to polynomials. Suppose
we have a finite set of exponents $\mathcal{A} = \{\alpha_1, \ldots, \alpha_l\} \subset \mathbb{Z}^n_{\ge 0}$. Then let $L(\mathcal{A})$
be the set of all polynomials whose terms all have exponents in $\mathcal{A}$. Thus

$$L(\mathcal{A}) = \{c_1 x^{\alpha_1} + \cdots + c_l x^{\alpha_l} : c_i \in \mathbb{C}\}.$$

Note that $L(\mathcal{A})$ is a vector space over $\mathbb{C}$ of dimension $l$ (= the number of
elements in $\mathcal{A}$).

Exercise 10.

a. If $f \in L(\mathcal{A})$, show that $\mathrm{NP}(f) \subset \mathrm{Conv}(\mathcal{A})$. Give an example to show
that equality need not occur.

b. Show that there is a union of proper subspaces $W \subset L(\mathcal{A})$ such that
$\mathrm{NP}(f) = \mathrm{Conv}(\mathcal{A})$ for all $f \in L(\mathcal{A}) \setminus W$. This means that
$\mathrm{NP}(f) = \mathrm{Conv}(\mathcal{A})$ holds for a generic $f \in L(\mathcal{A})$.

Exercise 11. If $\mathcal{A}_d$ is as in Exercise 2, what is $L(\mathcal{A}_d)$?

Finally, we conclude this section with a slight generalization of the notion 
of monomial and polynomial. Since the vertices of a lattice polytope can 
have negative entries, it will be useful to have the corresponding algebraic 
objects. This leads to the notion of a polynomial whose terms can have 
negative exponents. 

Let $\alpha = (a_1, \ldots, a_n) \in \mathbb{Z}^n$ be an integer vector. The corresponding
Laurent monomial in variables $x_1, \ldots, x_n$ is

$$x^\alpha = x_1^{a_1} \cdots x_n^{a_n}.$$

For example, $x^2y^{-3}$ and $x^{-2}y^3$ are Laurent monomials in $x$ and $y$ whose
product is 1. More generally, we have

$$x^\alpha \cdot x^\beta = x^{\alpha+\beta} \quad \text{and} \quad x^\alpha \cdot x^{-\alpha} = 1$$

for all $\alpha, \beta \in \mathbb{Z}^n$. Finite linear combinations

$$f = \sum_{\alpha \in \mathbb{Z}^n} c_\alpha x^\alpha$$

of Laurent monomials are called Laurent polynomials, and the collection
of all Laurent polynomials forms a commutative ring under the obvious
sum and product operations. We denote the ring of Laurent polynomials
with coefficients in a field $k$ by $k[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$. See Exercise 15 below for
another way to understand this ring.
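
The product rule for Laurent monomials amounts to adding exponent vectors, which makes Laurent polynomials easy to model on a computer. The sketch below is one possible representation, a dictionary from integer exponent tuples to coefficients; the representation and the name `laurent_mul` are assumptions for illustration only.

```python
def laurent_mul(f, g):
    """Multiply Laurent polynomials stored as {exponent tuple: coefficient}."""
    product = {}
    for alpha, c in f.items():
        for beta, d in g.items():
            # x^alpha * x^beta = x^(alpha + beta)
            gamma = tuple(a + b for a, b in zip(alpha, beta))
            product[gamma] = product.get(gamma, 0) + c * d
    # Discard terms whose coefficients cancelled.
    return {e: c for e, c in product.items() if c != 0}

# Two mutually inverse Laurent monomials in x and y.
mono = {(2, -3): 1}
inverse = {(-2, 3): 1}
print(laurent_mul(mono, inverse))  # the constant 1, stored at exponent (0, 0)

# (x + x^-1)^2 in one variable.
h = {(1,): 1, (-1,): 1}
print(laurent_mul(h, h))
```

Negative entries in the exponent tuples need no special handling, which is the point of passing from polynomials to Laurent polynomials.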

The definition of the Newton polytope goes over unchanged to Laurent
polynomials; we simply allow vertices with negative components. Thus any
Laurent polynomial $f \in k[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$ has a Newton polytope $\mathrm{NP}(f)$,
which again is a lattice polytope. Similarly, given a finite set $\mathcal{A} \subset \mathbb{Z}^n$,
we get the vector space $L(\mathcal{A})$ of Laurent polynomials with exponents in $\mathcal{A}$.
Although the introduction of Laurent polynomials might seem unmotivated
at this point, they will prove to be very useful in the theory developed in
this chapter.



Additional Exercises for §1 

Exercise 12. This exercise will develop the theory of affine subspaces. An
affine subspace $A \subset \mathbb{R}^n$ is a subset with the property that

$$s_1, \ldots, s_m \in A \implies \sum_{i=1}^{m} \lambda_i s_i \in A \quad \text{whenever } \sum_{i=1}^{m} \lambda_i = 1.$$

Note that we do not require that $\lambda_i \ge 0$. We also need the following
definition: given a subset $S \subset \mathbb{R}^n$ and a vector $v \in \mathbb{R}^n$, the translate of $S$
by $v$ is the set $v + S = \{v + s : s \in S\}$.

a. If $A \subset \mathbb{R}^n$ is an affine subspace and $v \in A$, prove that the translate
$-v + A$ is a subspace of $\mathbb{R}^n$. Also show that $A = v + (-v + A)$, so that
$A$ is a translate of a subspace.

b. If $v, w \in A$, prove that $-v + A = -w + A$. Conclude that an affine
subspace is a translate of a unique subspace of $\mathbb{R}^n$.

c. Conversely, if $W \subset \mathbb{R}^n$ is a subspace and $v \in \mathbb{R}^n$, then show that the
translate $v + W$ is an affine subspace.

d. Explain how to define the dimension of an affine subspace.

Exercise 13. This exercise will define the dimension of a polytope
$Q \subset \mathbb{R}^n$. The basic idea is that $\dim Q$ is the dimension of the smallest affine
subspace containing $Q$.

a. Given any subset $S \subset \mathbb{R}^n$, show that

$$\mathrm{Aff}(S) = \Big\{\lambda_1 s_1 + \cdots + \lambda_m s_m : s_i \in S,\ \sum_{i=1}^{m} \lambda_i = 1\Big\}$$

is the smallest affine subspace containing $S$. Hint: Use the strategy
outlined in parts b, c and d of Exercise 1.

b. Using the previous exercise, explain how to define the dimension of a
polytope $Q \subset \mathbb{R}^n$.

c. If $\mathcal{A} = \{m_1, \ldots, m_l\}$ and $Q = \mathrm{Conv}(\mathcal{A})$, prove that $\dim Q = \dim W$,
where $W \subset \mathbb{R}^n$ is the subspace spanned by $m_2 - m_1, \ldots, m_l - m_1$.

d. Prove that a simplex in $\mathbb{R}^n$ (as defined in Exercise 2) has dimension $n$.

Exercise 14. Let $Q \subset \mathbb{R}^n$ be a polytope and $\nu$ be a nonzero vector.

a. Show that $m \cdot \nu = 0$ defines a subspace of $\mathbb{R}^n$ of dimension $n - 1$ and
that the affine hyperplane $m \cdot \nu = -a$ is a translate of this subspace.
Hint: Use the linear map $\mathbb{R}^n \to \mathbb{R}$ given by dot product with $\nu$.

b. Explain why $\min_{m \in Q}(m \cdot \nu)$ exists. Hint: $Q$ is closed and bounded, and
$m \mapsto m \cdot \nu$ is continuous.

c. If $a_Q(\nu)$ is defined as in (1.3), then prove that the intersection

$$Q_\nu = Q \cap \{m \in \mathbb{R}^n : m \cdot \nu = -a_Q(\nu)\}$$

is nonempty and that

$$Q \subset \{m \in \mathbb{R}^n : m \cdot \nu \ge -a_Q(\nu)\}.$$

Exercise 15. There are several ways to represent the ring of Laurent
polynomials in $x_1, \ldots, x_n$ as a quotient of a polynomial ring. Prove that

$$k[x_1^{\pm 1}, \ldots, x_n^{\pm 1}] \simeq k[x_1, \ldots, x_n, t_1, \ldots, t_n]/\langle x_1t_1 - 1, \ldots, x_nt_n - 1 \rangle$$

$$\simeq k[x_1, \ldots, x_n, t]/\langle x_1 \cdots x_n t - 1 \rangle.$$

Exercise 16. This exercise will study the translates of a polytope. The
translate of a set in $\mathbb{R}^n$ is defined in Exercise 12.

a. If $\mathcal{A} \subset \mathbb{R}^n$ is a finite set and $v \in \mathbb{R}^n$, prove that
$\mathrm{Conv}(v + \mathcal{A}) = v + \mathrm{Conv}(\mathcal{A})$.

b. Prove that a translate of a polytope is a polytope.

c. If a polytope $Q$ is represented by the inequalities (1.4), what are the
inequalities defining $v + Q$?

Exercise 17. If $f \in k[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$ is a Laurent polynomial and $\alpha \in \mathbb{Z}^n$,
how is $\mathrm{NP}(x^\alpha f)$ related to $\mathrm{NP}(f)$? Hint: See the previous exercise.



§2 Sparse Resultants 

The multipolynomial resultant $\mathrm{Res}_{d_0, \ldots, d_n}(F_0, \ldots, F_n)$ discussed in
Chapter 3 is a very large polynomial, partly due to the size of the input
polynomials $F_0, \ldots, F_n$. They have lots of coefficients, especially as their
total degree increases. In practice, when people deal with polynomials of
large total degree, they rarely use all of the coefficients. It’s much more
common to encounter sparse polynomials, which involve only exponents lying
in a finite set $\mathcal{A} \subset \mathbb{Z}^n$. This suggests that there should be a corresponding
notion of sparse resultant.

To begin our discussion of sparse resultants, we return to the implic- 
itization problem introduced in §2 of Chapter 3. Consider the surface 
parametrized by the equations 

(2.1)    $$\begin{aligned} x &= f(s, t) = a_0 + a_1s + a_2t + a_3st \\ y &= g(s, t) = b_0 + b_1s + b_2t + b_3st \\ z &= h(s, t) = c_0 + c_1s + c_2t + c_3st, \end{aligned}$$

where $a_0, \ldots, c_3$ are constants. This is sometimes called a bilinear surface
parametrization. We will assume

(2.2)    $$\det \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} \ne 0.$$

In Exercise 7 at the end of the section, you will show that this condition
rules out the trivial case when (2.1) parametrizes a plane.

Our goal is to find the implicit equation of (2.1). This means finding a
polynomial $p(x, y, z)$ such that $p(x, y, z) = 0$ if and only if $x, y, z$ are given
by (2.1) for some choice of $s, t$. In Proposition (2.6) of Chapter 3, we used
the resultant

(2.3)    $$p(x, y, z) = \mathrm{Res}_{2,2,2}(F - xu^2,\ G - yu^2,\ H - zu^2)$$

to find the implicit equation, where $F, G, H$ are the homogenization of
$f, g, h$ with respect to $u$. Unfortunately, this method fails for the case at
hand.

Exercise 1. Show that the resultant (2.3) vanishes identically when 
F,G,H come from homogenizing the polynomials in (2.1). Hint: You 
already did a special case of this in Exercise 2 of Chapter 3, §2. 

The remarkable fact is that although the multipolynomial resultant from
Chapter 3 fails, a sparse resultant still exists in this case. In Exercise 2
below, you will show that the implicit equation for (2.1) is given by the
determinant

(2.4)    $$p(x, y, z) = \det \begin{pmatrix} a_0 - x & a_1 & a_2 & a_3 & 0 & 0 \\ b_0 - y & b_1 & b_2 & b_3 & 0 & 0 \\ c_0 - z & c_1 & c_2 & c_3 & 0 & 0 \\ 0 & a_0 - x & 0 & a_2 & a_1 & a_3 \\ 0 & b_0 - y & 0 & b_2 & b_1 & b_3 \\ 0 & c_0 - z & 0 & c_2 & c_1 & c_3 \end{pmatrix}.$$

Expanding this $6 \times 6$ determinant, we see that $p(x, y, z)$ is a polynomial of
total degree 2 in $x$, $y$ and $z$.

Exercise 2.

a. If $x, y, z$ are as in (2.1), show that the determinant (2.4) vanishes. Hint:
Consider the system of equations obtained by multiplying each
equation of (2.1) by 1 and $s$. You should get 6 equations in the 6
“unknowns” $1, s, t, st, s^2, s^2t$. Notice the similarity with Proposition (2.10)
of Chapter 3.

b. Next assume (2.4) vanishes. We want to prove the existence of $s, t$ such
that (2.1) holds. As a first step, let $A$ be the matrix of (2.4) and explain
why we can find a nonzero column vector $v = (\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5, \alpha_6)^t$
($t$ denotes transpose) such that $Av = 0$. Then use (2.2) to prove that
$\alpha_1 \ne 0$. Hint: Write out $Av = 0$ explicitly and use the first three
equations. Then use the final three.

c. If we take the vector $v$ of part b and multiply by $1/\alpha_1$, we can write $v$
in the form $v = (1, s, t, \alpha, \beta, \gamma)$. Explain why it suffices to prove that
$\alpha = st$.

d. Use (2.2) to prove $\alpha = st$, $\beta = s^2$ and $\gamma = s\alpha$. This will complete the
proof that the implicit equation of (2.1) is given by (2.4). Hint: In the
equations $Av = 0$, eliminate $a_0 - x$, $b_0 - y$, $c_0 - z$.

e. Explain why the above proof gives a linear algebra method to find $s, t$ for
a given point $(x, y, z)$ on the surface. This solves the inversion problem
for the parametrized surface. Hint: In the notation of part b, you will
show that $s = \alpha_2/\alpha_1$ and $t = \alpha_3/\alpha_1$.

A goal of this section is to explain why a resultant like (2.4) can exist even
though the standard multipolynomial resultant (2.3) vanishes identically.
The basic reason is that although the equations (2.1) are quadratic in $s, t$,
they do not use all monomials of total degree $\le 2$ in $s, t$. The sparse
resultant works like the multipolynomial resultant of §2, except that we
restrict the exponents occurring in the equations.
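
As a numerical sanity check on the $6 \times 6$ determinant of (2.4), the sketch below builds the matrix from coefficient lists $(a_0, a_1, a_2, a_3)$, $(b_0, \ldots, b_3)$, $(c_0, \ldots, c_3)$ and evaluates it at points on and off the surface. The function names and the sample coefficients are illustrative assumptions, and agreement on particular inputs is evidence rather than a proof.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (adequate for 6 x 6)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total

def implicit_matrix(a, b, c, x, y, z):
    """Matrix of (2.4); columns correspond to 1, s, t, st, s^2, s^2*t."""
    rows = []
    for (k0, k1, k2, k3), w in ((a, x), (b, y), (c, z)):
        rows.append([k0 - w, k1, k2, k3, 0, 0])   # equation multiplied by 1
    for (k0, k1, k2, k3), w in ((a, x), (b, y), (c, z)):
        rows.append([0, k0 - w, 0, k2, k1, k3])   # equation multiplied by s
    return rows

def p(a, b, c, x, y, z):
    """Value of the determinant (2.4) at the point (x, y, z)."""
    return det(implicit_matrix(a, b, c, x, y, z))

def surface_point(a, b, c, s, t):
    """Point (x, y, z) given by the parametrization (2.1) at parameters (s, t)."""
    return tuple(k0 + k1 * s + k2 * t + k3 * s * t for k0, k1, k2, k3 in (a, b, c))

# Hypothetical sample: x = s, y = t, z = s*t, whose implicit equation is z = x*y.
a, b, c = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(p(a, b, c, 2, 3, 6))   # a point on the surface
print(p(a, b, c, 2, 3, 7))   # a point off the surface
```

For the sample parametrization the determinant vanishes at $(2, 3, 6)$ and not at $(2, 3, 7)$, in line with the implicit equation $z = xy$.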

For simplicity, we will only treat the special case when all of the
equations have exponents lying in the same set, leaving the general case
for §6. We will also work exclusively over the field $\mathbb{C}$ of complex
numbers. Thus, suppose that the variables are $t_1, \ldots, t_n$, and fix a finite set
$\mathcal{A} = \{m_1, \ldots, m_l\} \subset \mathbb{Z}^n$ of exponents. Since negative exponents can occur,
we will use the Laurent polynomials

$$f = a_1t^{m_1} + \cdots + a_lt^{m_l} \in L(\mathcal{A}),$$

as defined in §1. Given $f_0, \ldots, f_n \in L(\mathcal{A})$, we get $n + 1$ equations in $n$
unknowns $t_1, \ldots, t_n$:

(2.5)    $$\begin{aligned} f_0 &= a_{01}t^{m_1} + \cdots + a_{0l}t^{m_l} = 0 \\ &\ \ \vdots \\ f_n &= a_{n1}t^{m_1} + \cdots + a_{nl}t^{m_l} = 0. \end{aligned}$$

In seeking solutions of these equations, the presence of negative exponents
means that we should consider only nonzero solutions of (2.5). We will use
the notation

$$\mathbb{C}^* = \mathbb{C} \setminus \{0\}$$

for the set of nonzero complex numbers.

The sparse resultant will be a polynomial in the coefficients $a_{ij}$ which
vanishes precisely when we can find a “solution” of (2.5). We put
“solution” in quotes because although the previous paragraph suggests that
solutions should lie in $(\mathbb{C}^*)^n$, the situation is actually more complicated.
For instance, the multipolynomial resultants from Chapter 3 use
homogeneous polynomials, which means that the “solutions” lie in projective
space. The situation for sparse resultants is similar, though with a twist:
a “solution” of (2.5) need not lie in $(\mathbb{C}^*)^n$, but the space where it does lie
need not be $\mathbb{P}^n$. For example, we will see in §3 that for equations like (2.1),
the “solutions” lie in $\mathbb{P}^1 \times \mathbb{P}^1$ rather than $\mathbb{P}^2$.

To avoid the problem of where the solutions lie, we will take a
conservative approach and initially restrict the solutions to lie in $(\mathbb{C}^*)^n$. Then,
in (2.5), the coefficients give a point $(a_{ij}) \in \mathbb{C}^{(n+1) \times l}$, and we consider the
subset

$$Z_0(\mathcal{A}) = \{(a_{ij}) \in \mathbb{C}^{(n+1) \times l} : \text{(2.5) has a solution in } (\mathbb{C}^*)^n\}.$$

Since $Z_0(\mathcal{A})$ might not be a variety in $\mathbb{C}^{(n+1) \times l}$, we use the following fact:

• (Zariski Closure) Given a subset $S \subset \mathbb{C}^m$, there is a smallest affine
variety $\overline{S} \subset \mathbb{C}^m$ containing $S$. We call $\overline{S}$ the Zariski closure of $S$.

(See, for example, [CLO], §4 of Chapter 4.) Then let $Z(\mathcal{A}) = \overline{Z_0(\mathcal{A})}$ be the
Zariski closure of $Z_0(\mathcal{A})$.

The sparse resultant will be the equation defining $Z(\mathcal{A}) \subset \mathbb{C}^{(n+1) \times l}$. To
state our result precisely, we introduce a variable $u_{ij}$ for each coefficient $a_{ij}$.
Then, for a polynomial $P \in \mathbb{C}[u_{ij}]$, we let $P(f_0, \ldots, f_n)$ denote the number
obtained by replacing each variable $u_{ij}$ with the corresponding coefficient
$a_{ij}$ from (2.5). We can now state the basic existence result for the sparse
resultant.

(2.6) Theorem. Let $\mathcal{A} \subset \mathbb{Z}^n$ be a finite set, and assume that $\mathrm{Conv}(\mathcal{A})$ is
an $n$-dimensional polytope. Then there is an irreducible polynomial
$\mathrm{Res}_{\mathcal{A}} \in \mathbb{Z}[u_{ij}]$ such that for $(a_{ij}) \in \mathbb{C}^{(n+1) \times l}$, we have

$$(a_{ij}) \in Z(\mathcal{A}) \iff \mathrm{Res}_{\mathcal{A}}(a_{ij}) = 0.$$

In particular, if (2.5) has a solution with $t_1, \ldots, t_n \in \mathbb{C}^*$, then

$$\mathrm{Res}_{\mathcal{A}}(f_0, \ldots, f_n) = 0.$$

Proof. See [GKZ], Chapter 8. □

The sparse resultant or $\mathcal{A}$-resultant is the polynomial $\mathrm{Res}_{\mathcal{A}}$. Notice that
$\mathrm{Res}_{\mathcal{A}}$ is determined uniquely up to $\pm$ since it is irreducible in $\mathbb{Z}[u_{ij}]$. The
condition that the convex hull of $\mathcal{A}$ has dimension $n$ is needed to ensure
that we have the right number of equations in (2.5). Here is an example of
what can happen when the convex hull has strictly lower dimension.

Exercise 3. Let $\mathcal{A} = \{(1, 0), (0, 1)\} \subset \mathbb{Z}^2$, so that $f_i = a_{i1}t_1 + a_{i2}t_2$ for
$i = 0, 1, 2$. Show that rather than one condition for $f_0 = f_1 = f_2 = 0$
to have a solution, there are three. Hint: See part b of Exercise 1 from
Chapter 3, §2.

We next show that the multipolynomial resultant from Chapter 3 is a
special case of the sparse resultant. For $d > 0$, let

$$\mathcal{A}_d = \{m \in \mathbb{Z}^n_{\ge 0} : |m| \le d\}.$$

Also consider variables $x_0, \ldots, x_n$, which will be related to $t_1, \ldots, t_n$ by
$t_i = x_i/x_0$ for $1 \le i \le n$. Then we homogenize the $f_i$ from (2.5) in the
usual way, defining

(2.7)    $$F_i(x_0, \ldots, x_n) = x_0^d f_i(t_1, \ldots, t_n) = x_0^d f_i(x_1/x_0, \ldots, x_n/x_0)$$

for $0 \le i \le n$. This gives $n + 1$ homogeneous polynomials $F_i$ in the $n + 1$
variables $x_0, \ldots, x_n$. Note that the $F_i$ all have total degree $d$.

(2.8) Proposition. For $\mathcal{A}_d = \{m \in \mathbb{Z}^n_{\ge 0} : |m| \le d\}$, we have

$$\mathrm{Res}_{\mathcal{A}_d}(f_0, \ldots, f_n) = \pm \mathrm{Res}_{d, \ldots, d}(F_0, \ldots, F_n),$$

where $\mathrm{Res}_{d, \ldots, d}$ is the multipolynomial resultant from Chapter 3.

Proof. If (2.5) has a solution $(t_1, \ldots, t_n) \in (\mathbb{C}^*)^n$, then $(x_0, \ldots, x_n) =
(1, t_1, \ldots, t_n)$ is a nontrivial solution of $F_0 = \cdots = F_n = 0$. This shows
that $\mathrm{Res}_{d, \ldots, d}$ vanishes on $Z_0(\mathcal{A}_d)$. By the definition of Zariski closure, it
must vanish on $Z(\mathcal{A}_d)$. Since $Z(\mathcal{A}_d)$ is defined by the irreducible equation
$\mathrm{Res}_{\mathcal{A}_d} = 0$, the argument of Proposition (2.10) of Chapter 3 shows that
$\mathrm{Res}_{d, \ldots, d}$ is a multiple of $\mathrm{Res}_{\mathcal{A}_d}$. But $\mathrm{Res}_{d, \ldots, d}$ is an irreducible polynomial
by Theorem (2.3) of Chapter 3, and the desired equality follows. □

Because $\mathcal{A}_d = \{m \in \mathbb{Z}^n_{\ge 0} : |m| \le d\}$ gives all exponents of total degree
at most $d$, the multipolynomial resultant $\mathrm{Res}_{d, \ldots, d}$ is sometimes called the
dense resultant, in contrast to the sparse resultant $\mathrm{Res}_{\mathcal{A}}$.

We next discuss the structure of the polynomial $\mathrm{Res}_{\mathcal{A}}$ in more detail.
Our first question concerns its total degree, which is determined by the
convex hull $Q = \mathrm{Conv}(\mathcal{A})$. The intuition is that as $Q$ gets larger, so does
the sparse resultant. As in §1, we measure the size of $Q$ using its volume
$\mathrm{Vol}_n(Q)$. This affects the degree of $\mathrm{Res}_{\mathcal{A}}$ as follows.

(2.9) Theorem. Let $\mathcal{A} = \{m_1, \ldots, m_l\}$, and assume that every element
of $\mathbb{Z}^n$ is an integer linear combination of $m_2 - m_1, \ldots, m_l - m_1$. Then, if
we fix $i$ between 0 and $n$, $\mathrm{Res}_{\mathcal{A}}$ is homogeneous in the coefficients of each
$f_i$ of degree $n!\,\mathrm{Vol}_n(Q)$, where $Q = \mathrm{Conv}(\mathcal{A})$. This means that

$$\mathrm{Res}_{\mathcal{A}}(f_0, \ldots, \lambda f_i, \ldots, f_n) = \lambda^{n!\,\mathrm{Vol}_n(Q)}\, \mathrm{Res}_{\mathcal{A}}(f_0, \ldots, f_n).$$

Furthermore, the total degree of $\mathrm{Res}_{\mathcal{A}}$ is $(n + 1)!\,\mathrm{Vol}_n(Q)$.

Proof. The first assertion is proved in [GKZ], Chapter 8. As we observed
in Exercise 1 of Chapter 3, §3, the final assertion follows by considering
$\mathrm{Res}_{\mathcal{A}}(\lambda f_0, \ldots, \lambda f_n)$. □

For an example of Theorem (2.9), note that $\mathcal{A}_d = \{m \in \mathbb{Z}^n_{\ge 0} : |m| \le d\}$
satisfies the hypothesis of the theorem, and its convex hull has volume $d^n/n!$
by Exercise 3 of §1. Using Proposition (2.8), we conclude that $\mathrm{Res}_{d, \ldots, d}$
has degree $d^n$ in $F_i$. This agrees with the prediction of Theorem (3.1) of
Chapter 3.

We can also explain how the hypothesis of Theorem (2.9) relates to
Theorem (2.6). If the $m_i - m_1$ span over $\mathbb{Z}$, they also span over $\mathbb{R}$, so that
the convex hull $Q = \mathrm{Conv}(\mathcal{A})$ has dimension $n$ by Exercise 13 of §1. Thus
Theorem (2.9) places a stronger condition on $\mathcal{A} \subset \mathbb{Z}^n$ than Theorem (2.6).
The following example shows what can go wrong if the $m_i - m_1$ don’t span
over $\mathbb{Z}$.


Exercise 4. Let $\mathcal{A} = \{0, 2\} \subset \mathbb{Z}$, so that $\mathrm{Vol}_1(\mathrm{Conv}(\mathcal{A})) = 2$.

a. Let $f_0 = a_{01} + a_{02}t^2$ and $f_1 = a_{11} + a_{12}t^2$. If the equations $f_0 = f_1 = 0$
have a solution in $(\mathbb{C}^*)^1$, show that $a_{01}a_{12} - a_{02}a_{11} = 0$.

b. Use part a to prove $\mathrm{Res}_{\mathcal{A}}(f_0, f_1) = a_{01}a_{12} - a_{02}a_{11}$.

c. Explain why the formula of part b does not contradict Theorem (2.9).



Using Theorem (2.9), we can now determine some sparse resultants
using the methods of earlier sections. For example, suppose
$\mathcal{A} = \{(0, 0), (1, 0), (0, 1), (1, 1)\} \subset \mathbb{Z}^2$, and consider the equations

(2.10)    $$\begin{aligned} f(s, t) &= a_0 + a_1s + a_2t + a_3st = 0 \\ g(s, t) &= b_0 + b_1s + b_2t + b_3st = 0 \\ h(s, t) &= c_0 + c_1s + c_2t + c_3st = 0. \end{aligned}$$

The exercise below will show that in this case, the sparse resultant is
given by a determinant:

(2.11)    $$\mathrm{Res}_{\mathcal{A}}(f, g, h) = \pm \det \begin{pmatrix} a_0 & a_1 & a_2 & a_3 & 0 & 0 \\ b_0 & b_1 & b_2 & b_3 & 0 & 0 \\ c_0 & c_1 & c_2 & c_3 & 0 & 0 \\ 0 & a_0 & 0 & a_2 & a_1 & a_3 \\ 0 & b_0 & 0 & b_2 & b_1 & b_3 \\ 0 & c_0 & 0 & c_2 & c_1 & c_3 \end{pmatrix}.$$

Exercise 5. As above, let $\mathcal{A} = \{(0, 0), (1, 0), (0, 1), (1, 1)\}$.

a. Adapt the argument of Exercise 2 to show that if (2.10) has a solution
in $(\mathbb{C}^*)^2$, then the determinant in (2.11) vanishes.

b. Adapt the argument of Proposition (2.10) of Chapter 3 to show that
$\mathrm{Res}_{\mathcal{A}}$ divides the determinant in (2.11).

c. By comparing degrees and using Theorem (2.9), show that the
determinant is an integer multiple of $\mathrm{Res}_{\mathcal{A}}$.

d. Show that the integer is $\pm 1$ by computing the determinant when $f =
1 + st$, $g = s$ and $h = t$.
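
Readers who want to experiment with the determinant in (2.11) can do so with a few lines of Python. The sketch below assumes that the rows of the matrix come from $f, g, h, sf, sg, sh$ written in the monomials $1, s, t, st, s^2, s^2t$ (this is how we read (2.11)). The names `det` and `matrix_2_11` are ours, and the sample polynomials are chosen only for illustration.

```python
from itertools import permutations

def det(matrix):
    """Exact determinant via the Leibniz formula (fine for a 6 x 6 integer matrix)."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        # sign of the permutation = (-1)^(number of inversions)
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total

def matrix_2_11(a, b, c):
    """Rows f, g, h, s*f, s*g, s*h in the monomials 1, s, t, st, s^2, s^2*t."""
    top = [list(p) + [0, 0] for p in (a, b, c)]
    bottom = [[0, p[0], 0, p[2], p[1], p[3]] for p in (a, b, c)]
    return top + bottom

# f = -6 + st, g = -2 + s, h = -3 + t all vanish at (s, t) = (2, 3).
common = det(matrix_2_11((-6, 0, 0, 1), (-2, 1, 0, 0), (-3, 0, 1, 0)))

# The polynomials of part d: f = 1 + st, g = s, h = t.
unit = det(matrix_2_11((1, 0, 0, 1), (0, 1, 0, 0), (0, 0, 1, 0)))

print(common, unit)
```

The first determinant should vanish because the system has a solution in $(\mathbb{C}^*)^2$, and the second should have absolute value 1. Of course, a numerical experiment is no substitute for the arguments asked for in the exercise.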



It follows that the implicitization problem (2.1) can be solved by setting

\[ p(x, y, z) = \mathrm{Res}_{\mathcal{A}}(f - x,\, g - y,\, h - z), \tag{2.12} \]

where $\mathcal{A}$ is as above. Comparing this to (2.3), we see from Proposition (2.8)
that $\mathrm{Res}_{2,2,2}$ corresponds to $\mathcal{A}_2 = \mathcal{A} \cup \{(2, 0), (0, 2)\}$. The convex hull of
$\mathcal{A}_2$ is strictly larger than the convex hull of $\mathcal{A}$. This explains why our earlier
attempt failed: the convex hull was too big!

We also have the following sparse analog of Theorem (3.5) discussed in 
Chapter 3. 

(2.13) Theorem. When $\mathcal{A}$ satisfies the hypothesis of Theorem (2.9), the
resultant $\mathrm{Res}_{\mathcal{A}}$ has the following properties:

a. If $g_i = \sum_{j=0}^{n} b_{ij} f_j$, where $(b_{ij})$ is an invertible matrix, then

\[ \mathrm{Res}_{\mathcal{A}}(g_0, \ldots, g_n) = \det(b_{ij})^{n!\,\mathrm{Vol}(Q)}\, \mathrm{Res}_{\mathcal{A}}(f_0, \ldots, f_n). \]

b. Given indices $1 \leq k_0 < \cdots < k_n \leq l$, the bracket $[k_0 \ldots k_n]$ is defined
to be the determinant

\[ [k_0 \ldots k_n] = \det(u_{i,k_j}) \in \mathbb{Z}[u_{ij}]. \]

Then $\mathrm{Res}_{\mathcal{A}}$ is a polynomial in the brackets $[k_0 \ldots k_n]$.

Proof. See [GKZ], Chapter 8. As explained in the proof of Theorem (3.5)
of Chapter 3, the second part follows from the first. In §4, we will prove
that $n!\,\mathrm{Vol}(Q)$ is an integer since $Q$ is a lattice polytope. □

Exercise 6. As in Exercise 5, let $\mathcal{A} = \{(0, 0), (1, 0), (0, 1), (1, 1)\}$. Then
prove that

\[ \mathrm{Res}_{\mathcal{A}}(f, g, h) = [013][023] - [012][123]. \tag{2.14} \]

Hint: Expand the determinant (2.11) three times along certain well-chosen
rows and columns.
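
Before attempting the proof, it can be reassuring to compare both sides of (2.14) numerically. In the Python sketch below, the bracket $[ijk]$ is taken to be the determinant of columns $i, j, k$ of the $3 \times 4$ matrix of coefficients of $f, g, h$ from (2.10), with columns numbered $0, 1, 2, 3$. This reading of the bracket, the layout of the matrix in (2.11), and all function names are our assumptions.

```python
from itertools import permutations

def det(m):
    """Exact determinant via the Leibniz formula."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inv = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        term = (-1) ** inv
        for r, c in enumerate(perm):
            term *= m[r][c]
        total += term
    return total

def bracket(coeffs, i, j, k):
    """[ijk]: determinant of columns i, j, k of the 3 x 4 coefficient matrix."""
    return det([[row[i], row[j], row[k]] for row in coeffs])

def bracket_formula(coeffs):
    """Right-hand side of (2.14)."""
    return (bracket(coeffs, 0, 1, 3) * bracket(coeffs, 0, 2, 3)
            - bracket(coeffs, 0, 1, 2) * bracket(coeffs, 1, 2, 3))

def det_2_11(coeffs):
    """The 6 x 6 determinant of (2.11), rows f, g, h, s*f, s*g, s*h."""
    top = [list(p) + [0, 0] for p in coeffs]
    bottom = [[0, p[0], 0, p[2], p[1], p[3]] for p in coeffs]
    return det(top + bottom)

samples = [
    [(1, 0, 0, 1), (0, 1, 0, 0), (0, 0, 1, 0)],
    [(2, -1, 3, 5), (1, 4, -2, 7), (3, 0, 1, -1)],
    [(-6, 0, 0, 1), (-2, 1, 0, 0), (-3, 0, 1, 0)],
]
checks = [det_2_11(c) == bracket_formula(c) for c in samples]
print(checks)
```

With the rows and columns ordered as above, the two quantities should agree for every choice of coefficients, which is consistent with the $\pm$ sign in (2.11).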

The answer to Exercise 6 is more interesting than first meets the eye. Label
the points in $\mathcal{A} = \{(0, 0), (1, 0), (0, 1), (1, 1)\}$ as $0, 1, 2, 3$, corresponding
to the subscripts of the coefficients in (2.10). Then the brackets appearing
in (2.14) correspond to the two ways of dividing the square $Q = \mathrm{Conv}(\mathcal{A})$
into triangles. This is illustrated in Fig. 7.4 on the next page, where the
figure on the right corresponds to $[013][023]$, and the one on the left to
$[012][123]$.

The amazing fact is that this is no accident! In general, when we express
$\mathrm{Res}_{\mathcal{A}}$ as a polynomial in the brackets $[k_0 \ldots k_n]$, there is a very deep
relationship between certain terms in this polynomial and triangulations of
the polytope $Q = \mathrm{Conv}(\mathcal{A})$. The details can be found in [KSZ]. See also
[Stu4] for some nice examples.

Many of the other properties of multipolynomial resultants mentioned 
in §3 and §4 have sparse analogs. We refer the reader to [GKZ, Chapter 8] 
and [PS2] for further details.

Figure 7.4. Triangulations of the Unit Square



Our account of sparse resultants is by no means complete, and in
particular, we have the following questions:

• When $\mathrm{Res}_{\mathcal{A}}(f_0, \ldots, f_n)$ vanishes, the equations (2.5) should have a
solution, but where? In §3, we will see that toric varieties provide a natural
answer to this question.

• What happens when the polynomials in (2.5) have exponents not lying
in the same set $\mathcal{A}$? We will explore what happens in §6.

• How do we compute $\mathrm{Res}_{\mathcal{A}}(f_0, \ldots, f_n)$? We will (very briefly) sketch one
method in §6.

• What are sparse resultants good for? We've used them for implicitization
in (2.12), and applications to solving equations will be covered in §6. A
brief discussion of applications to geometric modeling, computational
geometry, vision and molecular structure can be found in [Emi2].



Additional Exercises for §2 

Exercise 7. Let $B$ be the $3 \times 3$ matrix in (2.2). In this exercise, we will
show that the parametrization (2.1) lies in a plane $\alpha x + \beta y + \gamma z = \delta$ if
and only if $\det(B) = 0$.

a. First, if the parametrization lies in the plane $\alpha x + \beta y + \gamma z = \delta$, then show
that $Bv = 0$, where $v = (\alpha, \beta, \gamma)^T$. Hint: If a polynomial in $s, t$ equals
zero for all values of $s$ and $t$, then the coefficients of the polynomial must
be zero.

b. Conversely, if $\det(B) = 0$, then we can find a nonzero column vector
$v = (\alpha, \beta, \gamma)^T$ such that $Bv = 0$. Show that $\alpha x + \beta y + \gamma z = \delta$ for an
appropriately chosen $\delta$.

Exercise 8. Given $\mathcal{A} = \{m_1, \ldots, m_l\} \subset \mathbb{Z}^n$ and $v \in \mathbb{Z}^n$, let $v + \mathcal{A} =
\{v + m_1, \ldots, v + m_l\}$. Explain why $\mathrm{Res}_{\mathcal{A}} = \mathrm{Res}_{v+\mathcal{A}}$. Hint: Remember that
in defining the resultant, we only use solutions of the equations (2.5) with
$t_1, \ldots, t_n \in \mathbb{C}^*$.

Exercise 9. For $\mathcal{A} = \{(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)\}$, compute $\mathrm{Res}_{\mathcal{A}}$
using the methods of Exercise 5. Hint: Let the variables be $s, t$, and let the
equations be $f = g = h = 0$ with coefficients $a_0, \ldots, c_4$. Multiply each of
the three equations by $1, s, t$. This will give you a $9 \times 9$ determinant. The
tricky part is finding polynomials $f, g, h$ such that the determinant is $\pm 1$.
See part d of Exercise 5.

Exercise 10. This exercise will explore the Dixon resultant introduced by
Dixon in 1908. See Section 2.4 of [Stu4] for some nice examples. Let

\[ \mathcal{A}_{l,m} = \{(a, b) \in \mathbb{Z}^2 : 0 \leq a \leq l,\ 0 \leq b \leq m\}. \]

Note that $\mathcal{A}_{l,m}$ has $(l + 1)(m + 1)$ elements. Let the variables be $s, t$. Our
goal is to find a determinant formula for $\mathrm{Res}_{\mathcal{A}_{l,m}}$.

a. Given $f, g, h \in L(\mathcal{A}_{l,m})$, we get equations $f = g = h = 0$. Multiplying
these equations by $s^a t^b$ for $(a, b) \in \mathcal{A}_{2l-1,m-1}$, show that
you get a system of $6lm$ equations in the $6lm$ “unknowns” $s^a t^b$ for
$(a, b) \in \mathcal{A}_{3l-1,2m-1}$. Hint: For $l = m = 1$, this is exactly what you did
in Exercise 1.

b. If $A$ is the matrix of part a, conclude that $\det(A) = 0$ whenever $f =
g = h = 0$ has a solution $(s, t) \in (\mathbb{C}^*)^2$. Also show that $\det(A)$ has
total degree $2lm$ in the coefficients of $f$, and similarly for $g$ and $h$.

c. What is the volume of the convex hull of $\mathcal{A}_{l,m}$?

d. Using Theorems (2.6) and (2.9), show that $\det(A)$ is a constant multiple
of $\mathrm{Res}_{\mathcal{A}_{l,m}}$.

e. Show that the constant is $\pm 1$ by considering $f = 1 + s^l t^m$, $g = s^l$ and
$h = t^m$. Hint: In this case, $A$ has $4lm$ rows with only one nonzero entry.
Use this to reduce to a $2lm \times 2lm$ matrix.
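
The bookkeeping in part a is easy to get wrong, so here is a small Python sketch that simply counts the equations and the “unknowns” for given $l$ and $m$. The function names `box` and `dixon_counts` are hypothetical helpers of ours; the sketch checks the count only and says nothing about the determinant itself.

```python
def box(l, m):
    """Exponent set A_{l,m} = {(a, b) : 0 <= a <= l, 0 <= b <= m}."""
    return [(a, b) for a in range(l + 1) for b in range(m + 1)]

def dixon_counts(l, m):
    """Number of equations and number of distinct monomials in part a."""
    multipliers = box(2 * l - 1, m - 1)      # the monomials s^a t^b we multiply by
    support = box(l, m)                      # exponents allowed in f, g, h
    # exponents that can occur in s^a t^b * f (and likewise for g, h)
    unknowns = {(a + c, b + d) for a, b in multipliers for c, d in support}
    equations = 3 * len(multipliers)         # one equation per multiplier and per polynomial
    return equations, len(unknowns)

for l, m in [(1, 1), (2, 1), (2, 3)]:
    print((l, m), dixon_counts(l, m), 6 * l * m)
```

Each of $f, g, h$ contributes $2lm$ of the rows, which matches the degree $2lm$ asserted in part b.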



§3 Toric Varieties 

Let $\mathcal{A} = \{m_1, \ldots, m_l\} \subset \mathbb{Z}^n$, and suppose that

\[ f_i = a_{i1} t^{m_1} + \cdots + a_{il} t^{m_l}, \quad i = 0, \ldots, n \]

are $n + 1$ Laurent polynomials in $L(\mathcal{A})$. The basic question we want to
answer in this section is: If $\mathrm{Res}_{\mathcal{A}}(f_0, \ldots, f_n) = 0$, where do the equations

\[ f_0 = \cdots = f_n = 0 \tag{3.1} \]

have a solution? In other words, what does it mean for the resultant to
vanish?

For $\mathcal{A}_d = \{m \in \mathbb{Z}^n_{\geq 0} : |m| \leq d\}$, we know the answer. Here, we
homogenize $f_0, \ldots, f_n$ as in (2.7) to get $F_0, \ldots, F_n$. Proposition (2.8)
implies

\[ \mathrm{Res}_{\mathcal{A}_d}(f_0, \ldots, f_n) = \mathrm{Res}_{d,\ldots,d}(F_0, \ldots, F_n), \]

and then Theorem (2.3) of Chapter 3 tells us

\[
\mathrm{Res}_{d,\ldots,d}(F_0, \ldots, F_n) = 0 \iff
\begin{cases} F_0 = \cdots = F_n = 0 \\ \text{has a nontrivial solution.} \end{cases}
\tag{3.2}
\]

Recall that a nontrivial solution means $(x_0, \ldots, x_n) \neq (0, \ldots, 0)$, i.e., a
solution in $\mathbb{P}^n$. Thus, by going from $(\mathbb{C}^*)^n$ to $\mathbb{P}^n$ and changing to
homogeneous coordinates in (3.1), we get a space where the vanishing of the
resultant means that our equations have a solution.

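To understand what happens in the general case, suppose that $\mathcal{A} =
\{m_1, \ldots, m_l\} \subset \mathbb{Z}^n_{\geq 0}$, and assume that $Q = \mathrm{Conv}(\mathcal{A})$ has dimension $n$.
Then consider the map

\[ \phi_{\mathcal{A}} : (\mathbb{C}^*)^n \longrightarrow \mathbb{P}^{l-1} \]

defined by

\[ \phi_{\mathcal{A}}(t_1, \ldots, t_n) = (t^{m_1}, \ldots, t^{m_l}). \tag{3.3} \]

Note that $(t^{m_1}, \ldots, t^{m_l})$ is never the zero vector since $t_i \in \mathbb{C}^*$ for all $i$.
Thus $\phi_{\mathcal{A}}$ is defined on all of $(\mathbb{C}^*)^n$, though the image of $\phi_{\mathcal{A}}$ need not be a
subvariety of $\mathbb{P}^{l-1}$. Then the toric variety $X_{\mathcal{A}}$ is the Zariski closure of the
image of $\phi_{\mathcal{A}}$, i.e.,

\[ X_{\mathcal{A}} = \overline{\phi_{\mathcal{A}}((\mathbb{C}^*)^n)} \subset \mathbb{P}^{l-1}. \]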



Toric varieties are an important area of research in algebraic geometry and 
feature in many applications. The reader should consult [GKZ] or [Stu2] 
for an introduction to toric varieties. There is also a more abstract theory 
of toric varieties, as described in [Ful]. 
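
To make the map (3.3) concrete, the following Python sketch evaluates $\phi_{\mathcal{A}}$ at rational points for the unit square $\mathcal{A} = \{(0,0), (1,0), (0,1), (1,1)\}$ used in §2. The function name `phi` is ours. The sketch also checks that each image point $(u_1, u_2, u_3, u_4)$ satisfies $u_1 u_4 = u_2 u_3$, so in this example the closure $X_{\mathcal{A}}$ lies inside that quadric surface in $\mathbb{P}^3$.

```python
from fractions import Fraction

def phi(A, t):
    """(t^{m_1}, ..., t^{m_l}) for t with nonzero coordinates; exponents may be any integers."""
    point = []
    for m in A:
        value = Fraction(1)
        for t_i, m_i in zip(t, m):
            value *= Fraction(t_i) ** m_i
        point.append(value)
    return point

A = [(0, 0), (1, 0), (0, 1), (1, 1)]

samples = [(2, 3), (Fraction(1, 2), -5), (Fraction(-3, 7), Fraction(4, 9))]
for t in samples:
    u = phi(A, t)
    print(u, u[0] * u[3] == u[1] * u[2])
```

Exact rational arithmetic is used so that the quadratic relation can be tested with `==` rather than up to rounding error.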

For us, the key fact is that the equations $f_i = a_{i1} t^{m_1} + \cdots + a_{il} t^{m_l} = 0$
from (3.1) extend naturally to $X_{\mathcal{A}}$. To see how this works, let $u_1, \ldots, u_l$
be homogeneous coordinates on $\mathbb{P}^{l-1}$. Then consider the linear function
$L_i = a_{i1} u_1 + \cdots + a_{il} u_l$, and notice that $f_i = L_i \circ \phi_{\mathcal{A}}$. However, $L_i$ is not
a function on $\mathbb{P}^{l-1}$ since $u_1, \ldots, u_l$ are homogeneous coordinates. But the
equation $L_i = 0$ still makes sense on $\mathbb{P}^{l-1}$ (be sure you understand why),
so in particular, $L_i = 0$ makes sense on $X_{\mathcal{A}}$. Since $L_i$ and $f_i$ have the same
coefficients, we can write $\mathrm{Res}_{\mathcal{A}}(L_0, \ldots, L_n)$ instead of $\mathrm{Res}_{\mathcal{A}}(f_0, \ldots, f_n)$.
Then we can characterize the vanishing of the resultant as follows.

(3.4) Theorem.

\[
\mathrm{Res}_{\mathcal{A}}(L_0, \ldots, L_n) = 0 \iff
\begin{cases} L_0 = \cdots = L_n = 0 \\ \text{has a solution in } X_{\mathcal{A}}. \end{cases}
\]

Proof. See Proposition 2.1 of Chapter 8 of [GKZ]. This result is also
discussed in [KSZ] and is generalized in [Roj5]. □



This theorem tells us that the resultant vanishes if and only if (3.1) has a
solution in the toric variety $X_{\mathcal{A}}$. From a more sophisticated point of view,
Theorem (3.4) says that $\mathrm{Res}_{\mathcal{A}}$ is closely related to the Chow form of $X_{\mathcal{A}}$.

To get a better idea of what Theorem (3.4) means, we will work out two
examples. First, if $\mathcal{A}_d = \{m \in \mathbb{Z}^n_{\geq 0} : |m| \leq d\}$, let's show that $X_{\mathcal{A}_d} = \mathbb{P}^n$.
Let $x_0, \ldots, x_n$ be homogeneous coordinates on $\mathbb{P}^n$, so that by Exercise 19
of Chapter 3, §4, there are $N = \binom{n+d}{d}$ monomials of total degree $d$ in
$x_0, \ldots, x_n$. These monomials give a map

\[ \Phi_d : \mathbb{P}^n \longrightarrow \mathbb{P}^{N-1} \]

defined by $\Phi_d(x_0, \ldots, x_n) = (\ldots, x^\alpha, \ldots)$, where we use all monomials $x^\alpha$
of total degree $d$. In Exercise 6 at the end of the section, you will show
that $\Phi_d$ is well-defined and one-to-one. We call $\Phi_d$ the Veronese map. The
image of $\Phi_d$ is a variety by the following basic fact.

• (Projective Images) Let $\Psi : \mathbb{P}^n \to \mathbb{P}^{N-1}$ be defined by $\Psi(x_0, \ldots, x_n) =
(h_1, \ldots, h_N)$, where the $h_i$ are homogeneous of the same degree and
don't vanish simultaneously on $\mathbb{P}^n$. Then the image $\Psi(\mathbb{P}^n) \subset \mathbb{P}^{N-1}$ is a
variety.



(See §5 of Chapter 8 of [CLO].) For $t_1, \ldots, t_n \in \mathbb{C}^*$, observe that

\[ \Phi_d(1, t_1, \ldots, t_n) = \phi_{\mathcal{A}_d}(t_1, \ldots, t_n), \tag{3.5} \]

where $\phi_{\mathcal{A}_d}$ is from (3.3) (see Exercise 6). Thus $\Phi_d(\mathbb{P}^n)$ is a variety containing
$\phi_{\mathcal{A}_d}((\mathbb{C}^*)^n)$, so that $X_{\mathcal{A}_d} \subset \Phi_d(\mathbb{P}^n)$. Exercise 6 will show that equality
occurs, so that $X_{\mathcal{A}_d} = \Phi_d(\mathbb{P}^n)$. Finally, since $\Phi_d$ is one-to-one, $\mathbb{P}^n$ can be
identified with its image under $\Phi_d$ (we are omitting some details here),
and we conclude that $X_{\mathcal{A}_d} = \mathbb{P}^n$. It follows from Theorem (3.4) that for
homogeneous polynomials $F_0, \ldots, F_n$ of degree $d$,

\[
\mathrm{Res}_{d,\ldots,d}(F_0, \ldots, F_n) = 0 \iff
\begin{cases} F_0 = \cdots = F_n = 0 \\ \text{has a solution in } \mathbb{P}^n. \end{cases}
\]

Thus we recover the characterization of $\mathrm{Res}_{d,\ldots,d}$ given in (3.2).
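
The identity (3.5) and the count $N = \binom{n+d}{d}$ can be checked on small cases with the Python sketch below. The function names are ours, and we order the monomials by listing exponent vectors $\alpha = (\alpha_0, \ldots, \alpha_n)$ with $|\alpha| = d$; the point $m \in \mathcal{A}_d$ matching $\alpha$ is then $(\alpha_1, \ldots, \alpha_n)$. Any other consistent ordering would do equally well.

```python
from itertools import product
from math import comb

def exponents_of_degree(d, num_vars):
    """All exponent vectors of total degree exactly d in num_vars variables."""
    return [e for e in product(range(d + 1), repeat=num_vars) if sum(e) == d]

def monomial(point, exponent):
    """Value of x^exponent at the given point."""
    value = 1
    for x, e in zip(point, exponent):
        value *= x ** e
    return value

def veronese(point, d):
    """Phi_d(x_0, ..., x_n) = (..., x^alpha, ...) over all |alpha| = d."""
    return [monomial(point, e) for e in exponents_of_degree(d, len(point))]

def phi_Ad(t, d):
    """phi_{A_d}(t), with m = (alpha_1, ..., alpha_n) listed in the order of veronese."""
    return [monomial(t, e[1:]) for e in exponents_of_degree(d, len(t) + 1)]

n, d = 2, 3
print(len(exponents_of_degree(d, n + 1)), comb(n + d, d))
print(veronese((1, 2, 5), d) == phi_Ad((2, 5), d))
```

Setting $x_0 = 1$ removes the factor $x_0^{\alpha_0}$ from each monomial, which is all that (3.5) asserts.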

For a second example, you will show in the next exercise that $\mathbb{P}^1 \times \mathbb{P}^1$
is the toric variety where the equations (2.10) have a solution when the
resultant vanishes.

Exercise 1. Let $\mathcal{A} = \{(0, 0), (1, 0), (0, 1), (1, 1)\}$. Then $\phi_{\mathcal{A}}(s, t) =
(1, s, t, st) \in \mathbb{P}^3$ and $X_{\mathcal{A}}$ is the Zariski closure of the image of $\phi_{\mathcal{A}}$. A
formula for $\mathrm{Res}_{\mathcal{A}}$ is given in (2.11).

a. Let the coordinates on $\mathbb{P}^1 \times \mathbb{P}^1$ be $(u, s, v, t)$, so that $(u, s)$ are
homogeneous coordinates on the first $\mathbb{P}^1$ and $(v, t)$ are homogeneous coordinates
on the second. Show that the Segre map $\Phi : \mathbb{P}^1 \times \mathbb{P}^1 \to \mathbb{P}^3$ defined by
$\Phi(u, s, v, t) = (uv, sv, ut, st)$ is well-defined and one-to-one.

b. Show that the image of $\Phi$ is $X_{\mathcal{A}}$ and explain why this allows us to
identify $\mathbb{P}^1 \times \mathbb{P}^1$ with $X_{\mathcal{A}}$.

c. Explain why the “homogenizations” of $f, g, h$ from (2.10) are

\[
\begin{aligned}
F(u, s, v, t) &= a_0 uv + a_1 sv + a_2 ut + a_3 st = 0 \\
G(u, s, v, t) &= b_0 uv + b_1 sv + b_2 ut + b_3 st = 0 \\
H(u, s, v, t) &= c_0 uv + c_1 sv + c_2 ut + c_3 st = 0,
\end{aligned}
\tag{3.6}
\]

and then prove that $\mathrm{Res}_{\mathcal{A}}(F, G, H) = 0$ if and only if $F = G = H = 0$
has a solution in $\mathbb{P}^1 \times \mathbb{P}^1$. In Exercises 7 and 8 at the end of the section,
you will give an elementary proof of this assertion.

Exercise 1 can be restated as saying that $\mathrm{Res}_{\mathcal{A}}(F, G, H) = 0$ if and only
if $F = G = H = 0$ has a nontrivial solution $(u, s, v, t)$, where nontrivial
now means $(u, s) \neq (0, 0)$ and $(v, t) \neq (0, 0)$. This is similar to (3.2),
except that we “homogenized” (3.1) in a different way, and “nontrivial”
has a different meaning.

Our next task is to show that there is a systematic procedure for
homogenizing the equations (3.1). The key ingredient will again be the polytope
$Q = \mathrm{Conv}(\mathcal{A})$. In particular, we will use the facets and inward normals
of $Q$, as defined in §1. If $Q$ has facets $\mathcal{F}_1, \ldots, \mathcal{F}_N$ with inward pointing
normals $\nu_1, \ldots, \nu_N$ respectively, each facet $\mathcal{F}_j$ lies in the supporting
hyperplane defined by $m \cdot \nu_j = -a_j$, and according to (1.4), the polytope $Q$
is given by

\[ Q = \{m \in \mathbb{R}^n : m \cdot \nu_j \geq -a_j \text{ for all } j = 1, \ldots, N\}. \tag{3.7} \]

As usual, we assume that $\nu_j \in \mathbb{Z}^n$ is the unique primitive inward pointing
normal of the facet $\mathcal{F}_j$.

We now explain how to homogenize the equations (3.1) in the general
case. Given the representation of $Q$ as in (3.7), we introduce new
variables $x_1, \ldots, x_N$. These “facet variables” are related to $t_1, \ldots, t_n$ by the
substitution

\[ t_i = x_1^{\nu_{1i}} x_2^{\nu_{2i}} \cdots x_N^{\nu_{Ni}}, \quad i = 1, \ldots, n, \tag{3.8} \]

where $\nu_{ji}$ is the $i$th coordinate of $\nu_j$. Then the “homogenization” of
$f(t_1, \ldots, t_n)$ is

\[ F(x_1, \ldots, x_N) = \Bigl(\prod_{j=1}^{N} x_j^{a_j}\Bigr) f(t_1, \ldots, t_n), \tag{3.9} \]

where each $t_i$ is replaced with (3.8). Note the similarity with (2.7). The
homogenization of the monomial $t^m$ will be denoted $x^{\alpha(m)}$. An explicit
formula for $x^{\alpha(m)}$ will be given below.

Since the inward normals $\nu_j$ can have negative coordinates, negative
exponents can appear in (3.8). Nevertheless, the following lemma shows
that $x^{\alpha(m)}$ has no negative exponents in the case we are interested in.

(3.10) Lemma. If $m \in Q$, then $x^{\alpha(m)}$ is a monomial in $x_1, \ldots, x_N$ with
nonnegative exponents.

Proof. Write $m \in \mathbb{Z}^n$ as $m = \sum_{i=1}^{n} c_i e_i$ with $c_i \in \mathbb{Z}$. Since $\nu_{ji} = \nu_j \cdot e_i$, (3.8) implies

\[ t^m = x_1^{m \cdot \nu_1} x_2^{m \cdot \nu_2} \cdots x_N^{m \cdot \nu_N}, \tag{3.11} \]

from which it follows that

\[ x^{\alpha(m)} = x_1^{m \cdot \nu_1 + a_1} x_2^{m \cdot \nu_2 + a_2} \cdots x_N^{m \cdot \nu_N + a_N}. \]

Since $m \in Q$, (3.7) implies that the exponents of the $x_j$ are $\geq 0$. □

Exercise 2. Give a careful proof of (3.11).

Exercise 3. If we used $+a_j$ rather than $-a_j$ in the description of $Q =
\mathrm{Conv}(\mathcal{A})$ in (3.7), what effect would this have on (3.9)? This explains the
minus signs in (3.7): they give a nicer homogenization formula.

From the equations (3.1), we get the homogenized equations

\[
\begin{aligned}
F_0 &= a_{01} x^{\alpha(m_1)} + \cdots + a_{0l} x^{\alpha(m_l)} = 0 \\
&\;\;\vdots \\
F_n &= a_{n1} x^{\alpha(m_1)} + \cdots + a_{nl} x^{\alpha(m_l)} = 0,
\end{aligned}
\]

where $F_i$ is the homogenization of $f_i$. Notice that Lemma (3.10) applies
to these equations since $m_i \in \mathcal{A} \subset Q$ for all $i$. Also note that $F_0, \ldots, F_n$
and $f_0, \ldots, f_n$ have the same coefficients, so we can write the resultant as
$\mathrm{Res}_{\mathcal{A}}(F_0, \ldots, F_n)$.

Exercise 4.

a. For $\mathcal{A}_d = \{m \in \mathbb{Z}^n_{\geq 0} : |m| \leq d\}$, let the facet variables be $x_0, \ldots, x_n$,
where we use the labelling of Exercise 3. Show that $t_i = x_i/x_0$ and that
the homogenization of $f(t_1, \ldots, t_n)$ is given precisely by (2.7).

b. For $\mathcal{A} = \{(0, 0), (1, 0), (0, 1), (1, 1)\}$, the convex hull $Q = \mathrm{Conv}(\mathcal{A})$ in
$\mathbb{R}^2$ is given by the inequalities

\[ m \cdot \nu_s \geq 0, \quad m \cdot \nu_u \geq -1, \quad m \cdot \nu_t \geq 0, \quad \text{and} \quad m \cdot \nu_v \geq -1, \]

where $e_1 = \nu_s = -\nu_u$ and $e_2 = \nu_t = -\nu_v$. As indicated by the labelling
of the facets, the facet variables are $u, s, v, t$. This is illustrated in Fig. 7.5
on the next page. Show that the homogenization of (2.10) is precisely
the system of equations (3.6).
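
The recipe for homogenizing a monomial is mechanical enough to hand to a computer. The Python sketch below uses the exponent formula from the proof of Lemma (3.10), namely that the exponent of $x_j$ in $x^{\alpha(m)}$ is $m \cdot \nu_j + a_j$, and applies it to the unit square of part b. The function names, the ordering $(u, s, v, t)$ of the facet variables, and the string output format are our own choices.

```python
def alpha(m, normals, offsets):
    """Exponent vector of x^{alpha(m)}: the j-th exponent is m . nu_j + a_j."""
    return tuple(sum(mi * ni for mi, ni in zip(m, nu)) + a
                 for nu, a in zip(normals, offsets))

# Unit square, facet variables ordered (u, s, v, t):
#   m.nu_u >= -1,  m.nu_s >= 0,  m.nu_v >= -1,  m.nu_t >= 0
normals = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # nu_u, nu_s, nu_v, nu_t
offsets = [1, 0, 1, 0]                          # a_u, a_s, a_v, a_t
names = "usvt"

def as_monomial(exponents):
    """Render an exponent vector as a string such as 'uv' or 'u^2t'."""
    parts = [n if e == 1 else f"{n}^{e}"
             for n, e in zip(names, exponents) if e != 0]
    return "".join(parts) or "1"

A = [(0, 0), (1, 0), (0, 1), (1, 1)]
homogenized = [as_monomial(alpha(m, normals, offsets)) for m in A]
print(homogenized)

# A lattice point outside Q picks up a negative exponent:
print(alpha((2, 0), normals, offsets))
```

The four monomials produced for the points of $\mathcal{A}$ can be compared directly with the terms of (3.6).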
Figure 7.5. Facet Normals of the Unit Square



Our final task is to explain what it means for the equations $F_0 = \cdots =
F_n = 0$ to have a “nontrivial” solution. We use the vertices of the polytope
$Q$ for this purpose. Since $Q$ is the convex hull of the finite set $\mathcal{A} \subset \mathbb{Z}^n$, it
follows that every vertex of $Q$ lies in $\mathcal{A}$, i.e., the vertices are a special subset
of $\mathcal{A}$. This in turn gives a special collection of homogenized monomials
which will tell us what “nontrivial” means. The precise definitions are as
follows.

(3.12) Definition. Let $x_1, \ldots, x_N$ be facet variables for $Q = \mathrm{Conv}(\mathcal{A})$.

a. If $m \in \mathcal{A}$ is a vertex of $Q$, then we say that $x^{\alpha(m)}$ is a vertex monomial.

b. A point $(x_1, \ldots, x_N) \in \mathbb{C}^N$ is nontrivial if $x^{\alpha(m)} \neq 0$ for at least one
vertex monomial.

Exercise 5.

a. Let $\mathcal{A}_d$ and $x_0, \ldots, x_n$ be as in Exercise 4. Show that the vertex
monomials are $x_0^d, \ldots, x_n^d$, and conclude that $(x_0, \ldots, x_n)$ is nontrivial if and
only if $(x_0, \ldots, x_n) \neq (0, \ldots, 0)$.

b. Let $\mathcal{A}$ and $u, s, v, t$ be as in Exercise 4. Show that the vertex monomials
are $uv, ut, sv, st$, and conclude that $(u, s, v, t)$ is nontrivial if and only
if $(u, s) \neq (0, 0)$ and $(v, t) \neq (0, 0)$.

Exercises 4 and 5 show that the homogenizations used in (2.7) and (3.6)
are special cases of a theory that works for any set $\mathcal{A}$ of exponents. Once
we have the description (3.7) of the convex hull of $\mathcal{A}$, we can read off
everything we need, including the facet variables, how to homogenize, and
what nontrivial means.
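
For the unit square, the two descriptions of “nontrivial” can be compared by brute force. The Python sketch below takes the vertex monomials to be $uv, sv, ut, st$ (the homogenizations of the four vertices, in the order of the terms of (3.6)) and tests Definition (3.12) against the condition $(u, s) \neq (0, 0)$ and $(v, t) \neq (0, 0)$ on a small grid of integer points. The function names and the choice of grid are ours, and a finite check is of course no proof.

```python
from itertools import product

def vertex_monomials(u, s, v, t):
    """Values of the vertex monomials uv, sv, ut, st of the unit square."""
    return (u * v, s * v, u * t, s * t)

def is_nontrivial(point):
    """Definition (3.12): some vertex monomial is nonzero at the point."""
    return any(value != 0 for value in vertex_monomials(*point))

def both_pairs_nonzero(point):
    """The condition (u, s) != (0, 0) and (v, t) != (0, 0)."""
    u, s, v, t = point
    return (u, s) != (0, 0) and (v, t) != (0, 0)

# Compare the two conditions on all points with coordinates in {0, 1, -2}.
grid = list(product([0, 1, -2], repeat=4))
agree = all(is_nontrivial(p) == both_pairs_nonzero(p) for p in grid)
print(len(grid), agree)
```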

We now come to the main result of this section, which uses the facet
variables to give necessary and sufficient conditions for the vanishing of the
resultant.

(3.13) Theorem. Let $\mathcal{A} = \{m_1, \ldots, m_l\} \subset \mathbb{Z}^n_{\geq 0}$ be finite, and assume
that $Q = \mathrm{Conv}(\mathcal{A})$ is $n$-dimensional. If $x_1, \ldots, x_N$ are the facet variables,
then the homogenized system of equations

\[
\begin{aligned}
F_0 &= a_{01} x^{\alpha(m_1)} + \cdots + a_{0l} x^{\alpha(m_l)} = 0 \\
&\;\;\vdots \\
F_n &= a_{n1} x^{\alpha(m_1)} + \cdots + a_{nl} x^{\alpha(m_l)} = 0
\end{aligned}
\]

has a nontrivial solution in $\mathbb{C}^N$ if and only if $\mathrm{Res}_{\mathcal{A}}(F_0, \ldots, F_n) = 0$.

Proof. Let $U \subset \mathbb{C}^N$ consist of all nontrivial points, and notice that
$(\mathbb{C}^*)^N \subset U$. Then consider the map $\Phi$ defined by

\[ \Phi(x_1, \ldots, x_N) = (x^{\alpha(m_1)}, \ldots, x^{\alpha(m_l)}). \]

Since the vertex monomials appear among the $x^{\alpha(m_i)}$, we see that
$\Phi(x_1, \ldots, x_N) \neq (0, \ldots, 0)$ when $(x_1, \ldots, x_N) \in U$. Thus $\Phi$ can be
regarded as a map $\Phi : U \to \mathbb{P}^{l-1}$. By Theorem (3.4), it suffices to prove
that the image of $\Phi$ is the toric variety $X_{\mathcal{A}}$. To prove this, we will use the
following properties of the map $\Phi$:

(i) $\Phi(U)$ is a variety in $\mathbb{P}^{l-1}$.

(ii) $\Phi((\mathbb{C}^*)^N)$ is precisely $\phi_{\mathcal{A}}((\mathbb{C}^*)^n)$.

Assuming (i) and (ii), we see that $\phi_{\mathcal{A}}((\mathbb{C}^*)^n) \subset \Phi(U)$, and since $\Phi(U)$ is
a variety, we have $X_{\mathcal{A}} \subset \Phi(U)$. Then the argument of part d of Exercise 6
shows that $X_{\mathcal{A}} = \Phi(U)$, as desired.

The proofs of (i) and (ii) are rather technical and use results from
[BC] and [Cox]. Since Theorem (3.13) has not previously appeared in the
literature, we will include the details. What follows is for experts only!

For (i), note that [Cox] implies that $\Phi$ factors

\[ U \longrightarrow X_Q \longrightarrow \mathbb{P}^{l-1}, \]

where $X_Q$ is the abstract toric variety determined by $Q$ (see [Ful], §1.5).
By Theorem 2.1 of [Cox], $U \to X_Q$ is a categorical quotient, and in fact,
the proof shows that it is a universal categorical quotient (because $\mathbb{C}$ has
characteristic 0; see Theorem 1.1 of [FM]). A universal categorical quotient
is surjective by §0.2 of [FM], so that $U \to X_Q$ is surjective. This shows
that $\Phi(U)$ is the image of $X_Q \to \mathbb{P}^{l-1}$. Since $X_Q$ is a projective variety, a
generalization of the Projective Images principle used earlier in this section
implies that the image of $X_Q \to \mathbb{P}^{l-1}$ is a variety. We conclude that $\Phi(U)$
is a variety in $\mathbb{P}^{l-1}$.

For (ii), first observe that the restriction of $\Phi$ to $(\mathbb{C}^*)^N$ factors

\[ (\mathbb{C}^*)^N \xrightarrow{\;\psi\;} (\mathbb{C}^*)^n \xrightarrow{\;\phi_{\mathcal{A}}\;} \mathbb{P}^{l-1}, \]

where $\psi$ is given by (3.8) and $\phi_{\mathcal{A}}$ is given by (3.3). To prove this, note that
by the proof of Lemma (3.10), we can write

\[ x^{\alpha(m)} = \Bigl(\prod_{j=1}^{N} x_j^{a_j}\Bigr)\, t^m, \]

provided we use $\psi$ to write $t^m$ in terms of $x_1, \ldots, x_N$. It follows that

\[ \Phi(x_1, \ldots, x_N) = \Bigl(\prod_{j=1}^{N} x_j^{a_j}\Bigr)\, \phi_{\mathcal{A}}(\psi(x_1, \ldots, x_N)). \]

Since we are working in projective space, we conclude that $\Phi = \phi_{\mathcal{A}} \circ \psi$.
Using Remark 8.8 of [BC], we can identify $\psi$ with the restriction of
$U \to X_Q$ to $(\mathbb{C}^*)^N$. It follows from [Cox] (especially the discussion following
Theorem 2.1) that $\psi$ is onto, and it follows that

\[ \Phi((\mathbb{C}^*)^N) = \phi_{\mathcal{A}}(\psi((\mathbb{C}^*)^N)) = \phi_{\mathcal{A}}((\mathbb{C}^*)^n), \]

which completes the proof of the theorem. □

The proof of Theorem (3.13) shows that the map $\Phi : U \to X_{\mathcal{A}}$ is
surjective, which allows us to think of the facet variables as “homogeneous
coordinates” on $X_{\mathcal{A}}$. However, for this to be useful, we need to understand
when two points $P, Q \in U$ correspond to the same point in $X_{\mathcal{A}}$. In nice
cases, there is a simple description of when this happens (see Theorem 2.1
of [Cox]), but in general, things can be complicated.

There is a lot more that one can say about sparse resultants and toric 
varieties. In Chapter 8, we will discover a different use for toric varieties 
when we study combinatorial problems arising from magic squares. Toric 
varieties are also useful in studying solutions of sparse equations, which we 
will discuss in §5, and the more general sparse resultants defined in §6 also 
have relations to toric varieties. But before we can get to these topics, we 
first need to learn more about polytopes. 



Additional Exercises for §3 

Exercise 6. Consider the Veronese map $\Phi_d : \mathbb{P}^n \to \mathbb{P}^{N-1}$, $N = \binom{n+d}{d}$,
as in the text.

a. Show that $\Phi_d$ is well-defined. This has two parts: first, you must show
that $\Phi_d(x_0, \ldots, x_n)$ doesn't depend on which homogeneous coordinates
you use, and second, you must show that $\Phi_d(x_0, \ldots, x_n)$ never equals
the zero vector.

b. Show that $\Phi_d$ is one-to-one. Hint: If $\Phi_d(x_0, \ldots, x_n) = \Phi_d(y_0, \ldots, y_n)$,
then for some $\mu$, $\mu x^\alpha = y^\alpha$ for all $|\alpha| = d$. Pick $i$ such that $x_i \neq 0$ and
let $\lambda = y_i/x_i$. Then show that $\mu = \lambda^d$ and $y_j = \lambda x_j$ for all $j$.

c. Prove (3.5).

d. Prove that $\Phi_d(\mathbb{P}^n)$ is the Zariski closure of $\phi_{\mathcal{A}_d}((\mathbb{C}^*)^n)$ in $\mathbb{P}^{N-1}$. In
concrete terms, this means the following. Let the homogeneous coordinates
on $\mathbb{P}^{N-1}$ be $u_1, \ldots, u_N$. If a homogeneous polynomial $H(u_1, \ldots, u_N)$
vanishes on $\phi_{\mathcal{A}_d}((\mathbb{C}^*)^n)$, prove that $H$ vanishes on $\Phi_d(\mathbb{P}^n)$. Hint:
Use (3.5) to show that $x_0 \cdots x_n\, H \circ \Phi_d$ vanishes identically on $\mathbb{P}^n$. Then
argue that $H \circ \Phi_d$ must vanish on $\mathbb{P}^n$.



Exercise 7. Let $\mathcal{A}$ and $F, G, H$ be as in Exercise 1. In this exercise and
the next, you will give an elementary proof that $\mathrm{Res}_{\mathcal{A}}(F, G, H) = 0$ if and
only if $F = G = H = 0$ has a nontrivial solution $(u, s, v, t)$, meaning
$(u, s) \neq (0, 0)$ and $(v, t) \neq (0, 0)$.

a. If $F = G = H = 0$ has a nontrivial solution $(u, s, v, t)$, show that the
determinant in (2.11) vanishes. Hint: Multiply the equations by $u$ and $s$
to get 6 equations in the 6 “unknowns” $u^2v, usv, u^2t, ust, s^2v, s^2t$. Show
that the “unknowns” can't all vanish simultaneously.

b. For the remaining parts of the exercise, assume that the determinant
(2.11) vanishes. We will find a nontrivial solution of the equations $F =
G = H = 0$ by considering $3 \times 3$ submatrices (there are four of them)
of the matrix

\[
\begin{pmatrix}
a_0 & a_1 & a_2 & a_3 \\
b_0 & b_1 & b_2 & b_3 \\
c_0 & c_1 & c_2 & c_3
\end{pmatrix}.
\]

One of the $3 \times 3$ submatrices appears in (2.2), and if its determinant
doesn't vanish, show that we can find a solution of the form $(1, s, 1, t)$.
Hint: Adapt the argument of Exercise 2 of §2.

c. Now suppose instead that

\[
\det
\begin{pmatrix}
a_0 & a_2 & a_3 \\
b_0 & b_2 & b_3 \\
c_0 & c_2 & c_3
\end{pmatrix}
\neq 0.
\]

Show that we can find a solution of the form $(u, 1, 1, t)$.

d. The matrix of part b has two other $3 \times 3$ submatrices. Show that we can
find a nontrivial solution if either of these has nonvanishing determinant.

e. Conclude that we can find a nontrivial solution whenever the matrix of
part b has rank 3.

f. If the matrix has rank less than three, explain why it suffices to show
that the equations $F = G = 0$ have a nontrivial solution. Hence we
are reduced to the case where $H$ is the zero polynomial, which will be
considered in the next exercise.



Exercise 8. Continuing the notation of the previous exercise, we will show
that the equations $F = G = 0$ always have a nontrivial solution. Write the
equations in the form

\[
\begin{aligned}
(a_0 u + a_1 s)v + (a_2 u + a_3 s)t &= 0 \\
(b_0 u + b_1 s)v + (b_2 u + b_3 s)t &= 0,
\end{aligned}
\]

which is a system of two equations in the unknowns $v, t$.

a. Explain why we can find $(u_0, s_0) \neq (0, 0)$ such that

\[
\det
\begin{pmatrix}
a_0 u_0 + a_1 s_0 & a_2 u_0 + a_3 s_0 \\
b_0 u_0 + b_1 s_0 & b_2 u_0 + b_3 s_0
\end{pmatrix}
= 0.
\]

b. Given $(u_0, s_0)$ from part a, explain why we can find $(v_0, t_0) \neq (0, 0)$
such that $(u_0, s_0, v_0, t_0)$ is a nontrivial solution of $F = G = 0$.

Exercise 9. In Exercise 8 of §2, you showed that $\mathrm{Res}_{\mathcal{A}}$ is unchanged if we
translate $\mathcal{A}$ by a vector $v \in \mathbb{Z}^n$. You also know that if $Q$ is the convex hull
of $\mathcal{A}$, then $v + Q$ is the convex hull of $v + \mathcal{A}$ by Exercise 16 of §1.

a. If $Q$ is represented as in (3.7), show that $v + Q$ is represented by the
inequalities $m \cdot \nu_j \geq -a_j + v \cdot \nu_j$.

b. Explain why $\mathcal{A}$ and $v + \mathcal{A}$ have the same facet variables.

c. Consider $m \in Q$. Show that the homogenization of $t^m$ with respect to
$\mathcal{A}$ is equal to the homogenization of $t^{v+m}$ with respect to $v + \mathcal{A}$. This
says that the homogenized equations in Theorem (3.13) are unchanged
if we replace $\mathcal{A}$ with $v + \mathcal{A}$.

Exercise 10. Let $x_1, \ldots, x_N$ be facet variables for $Q = \mathrm{Conv}(\mathcal{A})$. We say
that two monomials $x^\alpha$ and $x^\beta$ have the same $\mathcal{A}$-degree if there is $m \in \mathbb{Z}^n$
such that

\[ \beta_j = \alpha_j + m \cdot \nu_j \]

for $j = 1, \ldots, N$.

a. Show that the monomials $x^{\alpha(m)}$, $m \in Q$, have the same $\mathcal{A}$-degree. Thus
the polynomials in Theorem (3.13) are $\mathcal{A}$-homogeneous, which means
that all terms have the same $\mathcal{A}$-degree.

b. If $\mathcal{A}_d$ and $x_0, \ldots, x_n$ are as in part a of Exercise 4, show that two
monomials $x^\alpha$ and $x^\beta$ have the same $\mathcal{A}_d$-degree if and only if they have
the same total degree.

c. If $\mathcal{A}$ and $u, s, v, t$ are as in part b of Exercise 4, show that two monomials
$u^{a_1} s^{a_2} v^{a_3} t^{a_4}$ and $u^{b_1} s^{b_2} v^{b_3} t^{b_4}$ have the same $\mathcal{A}$-degree if and only if
$a_1 + a_2 = b_1 + b_2$ and $a_3 + a_4 = b_3 + b_4$.




Exercise 11. This exercise will explore the notion of “nontrivial” given
in Definition (3.12). Let $m \in Q = \mathrm{Conv}(\mathcal{A})$, and let $x_1, \ldots, x_N$ be the
facet variables. We define the reduced monomial $x^{\alpha(m)}_{\mathrm{red}}$ to be the monomial
obtained from $x^{\alpha(m)}$ by replacing all nonzero exponents by 1.

a. Prove that

\[ x^{\alpha(m)}_{\mathrm{red}} = \prod_{m \notin \mathcal{F}_j} x_j. \]

Thus $x^{\alpha(m)}_{\mathrm{red}}$ is the product of those facet variables corresponding to the
facets not containing $m$. Hint: Look at the proof of Lemma (3.10) and
remember that $m \in \mathcal{F}_j$ if and only if $m \cdot \nu_j = -a_j$.

b. Prove that $(x_1, \ldots, x_N)$ is nontrivial if and only if $x^{\alpha(m)}_{\mathrm{red}} \neq 0$ for at
least one vertex $m \in Q$.

c. Prove that if $m \in Q \cap \mathbb{Z}^n$ is arbitrary, then $x^{\alpha(m)}$ is divisible by some
reduced vertex monomial. Hint: The face of $Q$ of smallest dimension
containing $m$ is the intersection of those facets $\mathcal{F}_j$ for which $m \cdot \nu_j = -a_j$.
Then let $m'$ be a vertex of $Q$ lying in this face.

d. As in the proof of Theorem (3.13), let $U \subset \mathbb{C}^N$ be the set of
nontrivial points. If $(x_1, \ldots, x_N) \notin U$, then use parts b and c to show
that $(x_1, \ldots, x_N)$ is a solution of the homogenized equations $F_0 =
\cdots = F_n = 0$ in the statement of Theorem (3.13). Thus the points
in $\mathbb{C}^N - U$ are “trivial” solutions of our equations, which explains the
name “nontrivial” for the points of $U$.

Exercise 12. Let $\mathcal{A} = \{(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)\}$. In Exercise 9 of
§2, you showed that $\mathrm{Res}_{\mathcal{A}}(f, g, h)$ was given by a certain $9 \times 9$ determinant.
The convex hull of $\mathcal{A}$ is pictured in Fig. 7.2, and you computed the inward
normals to be $e_1, e_2, -e_2, -e_1 - e_2$ in Exercise 6 of §1. Let the corresponding
facet variables be $x_1, x_2, x_3, x_4$.

a. What does it mean for $(x_1, x_2, x_3, x_4)$ to be nontrivial? Try to make
your answer as nice as possible. Hint: See part b of Exercise 5.

b. Write down explicitly the homogenizations $F, G, H$ of the polynomials
$f, g, h$ from Exercise 9 of §2.

c. By combining parts a and b, what is the condition for $\mathrm{Res}_{\mathcal{A}}(F, G, H)$
to vanish?

Exercise 13. In Exercise 10 of §2, you studied the Dixon resultant
$\mathrm{Res}_{\mathcal{A}_{l,m}}$, where $\mathcal{A}_{l,m} = \{(a, b) \in \mathbb{Z}^2 : 0 \leq a \leq l,\ 0 \leq b \leq m\}$.

a. Draw a picture of $\mathrm{Conv}(\mathcal{A}_{l,m})$ and label the facets using the variables
$u, s, v, t$ (this is similar to what you did in part b of Exercise 4).

b. What is the homogenization of $f \in L(\mathcal{A}_{l,m})$?

c. What does it mean for $(u, s, v, t)$ to be nontrivial?

d. What is the toric variety $X_{\mathcal{A}_{l,m}}$? Hint: It's one you've seen before!

e. Explain how the Dixon resultant can be formulated in terms of
bihomogeneous polynomials. A polynomial $f \in k[u, s, v, t]$ is bihomogeneous of
degree $(l, m)$ if it is homogeneous of degree $l$ as a polynomial in $u, s$ and
homogeneous of degree $m$ as a polynomial in $v, t$.



§4 Minkowski Sums and Mixed Volumes 

In this section, we will introduce some important constructions in the
theory of convex polytopes. Good general references for this material are
[BoF], [BZ], [Ewa] and [Lei]. [Ful] and [GKZ] also contain brief expositions.
Throughout, we will illustrate the main ideas using the Newton polytopes
(see §1) of the following polynomials:

\[
\begin{aligned}
f_1(x, y) &= a x^3 y^2 + b x + c y^2 + d \\
f_2(x, y) &= e x y^4 + f x^3 + g y.
\end{aligned}
\tag{4.1}
\]

We will assume that the coefficients $a, \ldots, g$ are all non-zero in $\mathbb{C}$.

There are two operations induced by the vector space structure in $\mathbb{R}^n$
that form new polytopes from old ones.

(4.2) Definition. Let $P, Q$ be polytopes in $\mathbb{R}^n$ and let $\lambda \geq 0$ be a real
number.

a. The Minkowski sum of $P$ and $Q$, denoted $P + Q$, is

\[ P + Q = \{p + q : p \in P \text{ and } q \in Q\}, \]

where $p + q$ denotes the usual vector sum in $\mathbb{R}^n$.

b. The polytope $\lambda P$ is defined by

\[ \lambda P = \{\lambda p : p \in P\}, \]

where $\lambda p$ is the usual scalar multiplication on $\mathbb{R}^n$.

For example, the Minkowski sum of the Newton polytopes $P_1 =
\mathrm{NP}(f_1)$ and $P_2 = \mathrm{NP}(f_2)$ from (4.1) is a convex heptagon with vertices
$(0, 1), (3, 0), (4, 0), (6, 2), (4, 6), (1, 6)$, and $(0, 3)$. In Fig. 7.6, $P_1$ is indicated
by dashed lines, $P_2$ by bold lines, and the Minkowski sum $P_1 + P_2$ is shaded.




Figure 7.6. Minkowski Sum of Polytopes 
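
The heptagon can also be recovered by machine. The Python sketch below forms all pairwise sums of the vertices of $P_1$ and $P_2$ and then takes the convex hull with the standard monotone chain method. We take $P_1 = \mathrm{Conv}\{(3,2), (1,0), (0,2), (0,0)\}$ and $P_2 = \mathrm{Conv}\{(1,4), (3,0), (0,1)\}$, which is how we read the exponent vectors of $f_1$ and $f_2$ in (4.1). The function names are ours.

```python
def cross(o, a, b):
    """Cross product of the vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Vertices of the convex hull in counterclockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def minkowski_sum(P, Q):
    """Vertices of P + Q, computed from all pairwise sums of vertices."""
    return convex_hull([(p[0] + q[0], p[1] + q[1]) for p in P for q in Q])

P1 = [(3, 2), (1, 0), (0, 2), (0, 0)]   # exponent vectors of f1
P2 = [(1, 4), (3, 0), (0, 1)]           # exponent vectors of f2
heptagon = minkowski_sum(P1, P2)
print(heptagon)
```

Taking the hull of the pairwise vertex sums works because the Minkowski sum of two polytopes is the convex hull of the sums of their vertices, which is the idea behind the hint in part b of Exercise 3 of this section.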
Exercise 1. In Fig. 7.6, show that the Minkowski sum $P_1 + P_2$ can be
obtained by placing a copy of $P_1$ at every point of $P_2$. Illustrate your answer
with a picture. This works because $P_1$ contains the origin.

Exercise 2. Let

\[
\begin{aligned}
f_1 &= a_{20} x^2 + a_{11} xy + a_{02} y^2 + a_{10} x + a_{01} y + a_{00} \\
f_2 &= b_{30} x^3 + b_{21} x^2 y + b_{12} x y^2 + b_{03} y^3 + b_{20} x^2 + \cdots + b_{00}
\end{aligned}
\]

be general (“dense”) polynomials of total degrees 2 and 3 respectively.
Construct the Newton polytopes $P_i = \mathrm{NP}(f_i)$ for $i = 1, 2$ and find the
Minkowski sum $P_1 + P_2$.

Exercise 3.

a. Show that if $f_1, f_2 \in \mathbb{C}[x_1, \ldots, x_n]$ and $P_i = \mathrm{NP}(f_i)$, then $P_1 + P_2 =
\mathrm{NP}(f_1 \cdot f_2)$.

b. Show in general that if $P_1$ and $P_2$ are polytopes, then their Minkowski
sum $P_1 + P_2$ is also convex. Hint: If $P_i = \mathrm{Conv}(\mathcal{A}_i)$, where $\mathcal{A}_i$ is finite,
what finite set will give $P_1 + P_2$?

c. Show that a Minkowski sum of lattice polytopes is again a lattice
polytope.

d. Show that $P + P = 2P$ for any polytope $P$. How does this generalize?

Given finitely many polytopes $P_1, \ldots, P_l$ in $\mathbb{R}^n$, we can form their
Minkowski sum $P_1 + \cdots + P_l$, which is again a polytope in $\mathbb{R}^n$. In §1,
we learned about the faces of a polytope. A useful fact is that faces of the
Minkowski sum $P_1 + \cdots + P_l$ are themselves Minkowski sums. Here is a
precise statement.

(4.3) Proposition. Let $P_1, \ldots, P_r$ be polytopes in $\mathbb{R}^n$, and let $P =
P_1 + \cdots + P_r$ be their Minkowski sum. Then every face $P'$ of $P$ can be
expressed as a Minkowski sum

\[ P' = P_1' + \cdots + P_r', \]

where each $P_i'$ is a face of $P_i$.

Proof. By §1, there is a nonzero vector $\nu \in \mathbb{R}^n$ such that

\[ P' = P_\nu = P \cap \{m \in \mathbb{R}^n : m \cdot \nu = -a_P(\nu)\}. \]

In Exercise 12 at the end of the section, you will show that

\[ P_\nu = (P_1 + \cdots + P_r)_\nu = (P_1)_\nu + \cdots + (P_r)_\nu, \]

which will prove the proposition. □

Exercise 4. Verify that Proposition (4.3) holds for each facet of the
Minkowski sum $P_1 + P_2$ in Fig. 7.6.
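
A numerical version of this verification is sketched below in Python. For a direction $\nu$, the function `face` returns the points of a finite set at which $m \cdot \nu$ is smallest; applied to the vertices of a polytope, these are the vertices of the face $P_\nu$. The sketch then compares the face of the set of pairwise vertex sums with the sum of the faces of $P_1$ and $P_2$. The vertex lists are our reading of (4.1), and the function names and the list of test directions are our own choices.

```python
def face(points, nu):
    """Points of a finite planar set at which m . nu attains its minimum."""
    values = [p[0] * nu[0] + p[1] * nu[1] for p in points]
    low = min(values)
    return {p for p, val in zip(points, values) if val == low}

def pairwise_sums(S, T):
    """All sums p + q with p in S and q in T."""
    return {(p[0] + q[0], p[1] + q[1]) for p in S for q in T}

P1 = [(3, 2), (1, 0), (0, 2), (0, 0)]
P2 = [(1, 4), (3, 0), (0, 1)]
P = sorted(pairwise_sums(P1, P2))   # finite set whose convex hull is P1 + P2

directions = [(1, 3), (0, 1), (-1, 1), (-2, -1), (0, -1), (3, -1), (1, 0), (1, 1)]
results = {nu: face(P, nu) == pairwise_sums(face(P1, nu), face(P2, nu))
           for nu in directions}
print(results)
print(sorted(face(P, (1, 3))))
```

For $\nu = (1, 3)$, for instance, the face of $P_1$ is the single vertex $(0, 0)$ and the face of $P_2$ is the edge from $(3, 0)$ to $(0, 1)$, so the corresponding facet of $P_1 + P_2$ is a translate of an edge of $P_2$.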







We next show how to compute the volume of a n-dimensional lattice 
polytope P using its facets. As in §1, each facet P oi P has a unique 
primitive inward pointing normal ujr G If the supporting hyperplane 
of P is m • ujr = — then the formula (1.4) for P can be stated as 

(4.4) P = P|{^ G : m • i/jT > — a^}, 

T 

where the intersection is over all facets P of P. Recall also that in the 
notation of (1.3), = ap{ujr). 

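Let $\nu_{\mathcal{F}}^\perp$ denote the $(n-1)$-dimensional subspace defined by $m \cdot \nu_{\mathcal{F}} = 0$.
Then $\nu_{\mathcal{F}}^\perp \cap \mathbb{Z}^n$ is closed under addition and scalar multiplication by integers.
One can prove that $\nu_{\mathcal{F}}^\perp \cap \mathbb{Z}^n$ is a lattice of rank $n - 1$, which means there are
$n - 1$ vectors $w_1, \ldots, w_{n-1} \in \nu_{\mathcal{F}}^\perp \cap \mathbb{Z}^n$ such that every element of
$\nu_{\mathcal{F}}^\perp \cap \mathbb{Z}^n$ is a unique linear combination of $w_1, \ldots, w_{n-1}$ with integer
coefficients. We call $w_1, \ldots, w_{n-1}$ a basis of $\nu_{\mathcal{F}}^\perp \cap \mathbb{Z}^n$. The existence of
$w_1, \ldots, w_{n-1}$ follows from the fundamental theorem on discrete subgroups of
Euclidean spaces. Using $w_1, \ldots, w_{n-1}$, we get the set

$\mathcal{P} = \{\lambda_1 w_1 + \cdots + \lambda_{n-1} w_{n-1} : 0 \le \lambda_i \le 1\}$,

which is called a fundamental lattice parallelotope of the lattice $\nu_{\mathcal{F}}^\perp \cap \mathbb{Z}^n$.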

If $S$ is a subset of $\mathbb{R}^n$ lying in any affine hyperplane, we can define
the Euclidean volume $\mathrm{Vol}_{n-1}(S)$. In particular, we can define $\mathrm{Vol}_{n-1}(\mathcal{F})$.
However, we also need to take the volume of the fundamental lattice
parallelotope $\mathcal{P}$ into account. This leads to the following definition.



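(4.5) Definition. The normalized volume of the facet $\mathcal{F}$ of the lattice
polytope $P$ is given by

$\mathrm{Vol}'_{n-1}(\mathcal{F}) = \dfrac{\mathrm{Vol}_{n-1}(\mathcal{F})}{\mathrm{Vol}_{n-1}(\mathcal{P})}$,

where $\mathcal{P}$ is a fundamental lattice parallelotope for $\nu_{\mathcal{F}}^\perp \cap \mathbb{Z}^n$.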



This definition says that the normalized volume is the usual volume 
scaled so that the fundamental lattice parallelotope has volume 1. In 
Exercise 13, you will show that this definition is independent of which fun- 
damental lattice parallelotope we use. We should also mention the following 
nice formula: 



$\mathrm{Vol}_{n-1}(\mathcal{P}) = \|\nu_{\mathcal{F}}\|$,

where $\|\nu_{\mathcal{F}}\|$ is the Euclidean length of the vector $\nu_{\mathcal{F}}$. We omit the proof
since we will not use this result.

For example, let $P_2 = \mathrm{NP}(f_2) = \mathrm{Conv}(\{(1,4), (3,0), (0,1)\})$ be the
Newton polytope of the polynomial $f_2$ from (4.1). For the facet

$\mathcal{F} = \mathrm{Conv}(\{(3,0), (0,1)\})$,

we have $\nu_{\mathcal{F}} = (1,3)$, and the line containing $\mathcal{F}$ is $x + 3y = 3$. It is easy to check
that $(3,0)$ and $(0,1)$ are as close together as any pair of integer points in
the line $x + 3y = 3$, so the line segment from $(3,0)$ to $(0,1)$ is a translate
of the fundamental lattice parallelotope. It follows that

$\mathrm{Vol}'_1(\mathcal{F}) = 1$.

Notice that the usual Euclidean length of $\mathcal{F}$ is $\sqrt{10}$. In general, the
normalized volume differs from the Euclidean volume.

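Exercise 5. Let $P_2 = \mathrm{NP}(f_2)$ be as above.

a. Show that for the facet $\mathcal{G} = \mathrm{Conv}(\{(3,0), (1,4)\})$, we have
$\nu_{\mathcal{G}} = (-2,-1)$ and $\mathrm{Vol}'_1(\mathcal{G}) = 2$.

b. Finally, for the facet $\mathcal{H} = \mathrm{Conv}(\{(0,1), (1,4)\})$, show that
$\nu_{\mathcal{H}} = (3,-1)$ and $\mathrm{Vol}'_1(\mathcal{H}) = 1$.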

Our main reason for introducing the normalized volume of a facet is the 
following lovely connection between the n-dimensional volume of a polytope 
and the (n — l)-dimensional normalized volumes of its facets. 

(4.6) Proposition. Let $P$ be a lattice polytope in $\mathbb{R}^n$, and assume that $P$
is represented as in (4.4). Then

$\mathrm{Vol}_n(P) = \dfrac{1}{n} \sum_{\mathcal{F}} a_{\mathcal{F}} \mathrm{Vol}'_{n-1}(\mathcal{F})$,

where the sum is taken over all facets of $P$.

Proof. See [BoF], [Lei] or [Ewa], Section IV.3. The formula given in these 
sources is not specifically adapted to lattice polytopes, but with minor 
modifications, one gets the desired result. Note also that this proposition
explains the minus sign used in the equation $m \cdot \nu_{\mathcal{F}} \ge -a_{\mathcal{F}}$ of a supporting
hyperplane. □

For an example of Proposition (4.6), we will compute the area of the
polytope $P_2 = \mathrm{NP}(f_2)$ of Exercise 5. First note that if we label the facet
normals $\nu_{\mathcal{F}} = (1,3)$, $\nu_{\mathcal{G}} = (-2,-1)$ and $\nu_{\mathcal{H}} = (3,-1)$ as above, then $P_2$
is defined by

$m \cdot \nu_{\mathcal{F}} \ge 3$,  $m \cdot \nu_{\mathcal{G}} \ge -6$,  and  $m \cdot \nu_{\mathcal{H}} \ge -1$.

It follows that $a_{\mathcal{F}} = -3$, $a_{\mathcal{G}} = 6$ and $a_{\mathcal{H}} = 1$. Applying Proposition (4.6),
the area of $P_2$ is given by

(4.7)  $\mathrm{Vol}_2(P_2) = (1/2)(-3 \cdot 1 + 6 \cdot 2 + 1 \cdot 1) = 5$.

You should check that this agrees with the result obtained from the
elementary area formula for triangles.

Exercise 6. Show that the area of the polytope $P_1 = \mathrm{NP}(f_1)$ for $f_1$ from
(4.1) is equal to 4, by first applying Proposition (4.6), and then checking
with an elementary area formula.
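
The facet computation behind (4.7) can be checked mechanically in the plane. The following Python sketch is only an illustration under stated assumptions: the function names are our own, the polygon must be given by its vertices in counterclockwise order, and the vertices of $P_1$ are read off from the exponents of $f_1 = ax^3y^2 + bx + cy^2 + d$. For an edge of a lattice polygon, the normalized length is the gcd of the coordinate differences of its endpoints.

```python
from fractions import Fraction
from math import gcd

def facet_data(vertices):
    """Facet data of a convex lattice polygon (vertices counterclockwise).

    Returns one triple (nu, a, length) per edge, where nu is the primitive
    inward normal, the edge lies on the line m . nu = -a, and length is the
    normalized (lattice) length of the edge.
    """
    data = []
    k = len(vertices)
    for i in range(k):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % k]
        dx, dy = x1 - x0, y1 - y0
        g = gcd(abs(dx), abs(dy))        # number of primitive steps on the edge
        nu = (-dy // g, dx // g)         # left of a counterclockwise edge is inward
        a = -(x0 * nu[0] + y0 * nu[1])   # supporting line: m . nu = -a
        data.append((nu, a, g))
    return data

def area_by_facets(vertices):
    """Area via Proposition (4.6) with n = 2: (1/2) * sum of a_F * Vol'_1(F)."""
    return Fraction(sum(a * length for _, a, length in facet_data(vertices)), 2)

def area_by_shoelace(vertices):
    """Elementary area formula, used as an independent check."""
    k = len(vertices)
    twice = sum(vertices[i][0] * vertices[(i + 1) % k][1]
                - vertices[(i + 1) % k][0] * vertices[i][1] for i in range(k))
    return Fraction(twice, 2)

P1 = [(0, 0), (1, 0), (3, 2), (0, 2)]   # exponents of f_1, counterclockwise
P2 = [(3, 0), (1, 4), (0, 1)]           # exponents of f_2, counterclockwise

for name, poly in (("P1", P1), ("P2", P2)):
    print(name, facet_data(poly), area_by_facets(poly), area_by_shoelace(poly))
```

For $P_2$ the triples reproduce the normals, the constants $a_{\mathcal{F}}, a_{\mathcal{G}}, a_{\mathcal{H}}$ and the normalized lengths used in (4.7).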

Proposition (4.6) enables us to prove results about volumes of lattice
polytopes using induction on dimension. Here is a nice example which is
relevant to Theorem (2.13).

(4.8) Proposition. If $P$ is a lattice polytope, then $n!\,\mathrm{Vol}_n(P)$ is an
integer.

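Proof. The proof is by induction on $n$. The case $n = 1$ is obvious, so
we may assume inductively that the result is true for lattice polytopes in
$\mathbb{R}^{n-1}$. By Proposition (4.6), we get

$n!\,\mathrm{Vol}_n(P) = \sum_{\mathcal{F}} a_{\mathcal{F}} \cdot (n-1)!\,\mathrm{Vol}'_{n-1}(\mathcal{F})$.

Note that $a_{\mathcal{F}}$ is an integer. If we can show that $(n-1)!\,\mathrm{Vol}'_{n-1}(\mathcal{F})$ is an
integer, the proposition will follow.

A basis $w_1, \ldots, w_{n-1}$ of the lattice $\nu_{\mathcal{F}}^\perp \cap \mathbb{Z}^n$ gives an isomorphism
$\phi : \nu_{\mathcal{F}}^\perp \cong \mathbb{R}^{n-1}$ which carries $\nu_{\mathcal{F}}^\perp \cap \mathbb{Z}^n \subset \nu_{\mathcal{F}}^\perp$ to the usual lattice
$\mathbb{Z}^{n-1} \subset \mathbb{R}^{n-1}$. Since the fundamental lattice parallelotope $\mathcal{P}$ maps to
$\{(a_1, \ldots, a_{n-1}) : 0 \le a_i \le 1\}$ under $\phi$, it follows easily that

$\mathrm{Vol}'_{n-1}(S) = \mathrm{Vol}_{n-1}(\phi(S))$,

where $\mathrm{Vol}_{n-1}$ is the usual Euclidean volume in $\mathbb{R}^{n-1}$. By translating $\mathcal{F}$,
we get a lattice polytope $\mathcal{F}' \subset \nu_{\mathcal{F}}^\perp$, and then $\phi(\mathcal{F}') \subset \mathbb{R}^{n-1}$ is a lattice
polytope in $\mathbb{R}^{n-1}$. Since

$(n-1)!\,\mathrm{Vol}'_{n-1}(\mathcal{F}) = (n-1)!\,\mathrm{Vol}'_{n-1}(\mathcal{F}') = (n-1)!\,\mathrm{Vol}_{n-1}(\phi(\mathcal{F}'))$,

we are done by our inductive assumption. □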

Our next result concerns the volumes of linear combinations of polytopes 
formed according to Definition (4.2). 

(4.9) Proposition. Consider any collection $P_1, \ldots, P_r$ of polytopes in
$\mathbb{R}^n$, and let $\lambda_1, \ldots, \lambda_r \in \mathbb{R}$ be nonnegative. Then

$\mathrm{Vol}_n(\lambda_1 P_1 + \cdots + \lambda_r P_r)$

is a homogeneous polynomial function of degree $n$ in the $\lambda_i$.

Proof. The proof is by induction on $n$. For $n = 1$, the $P_i = [\ell_i, r_i]$
are all line segments in $\mathbb{R}$ (possibly of length 0 if some $\ell_i = r_i$). The linear
combination $\lambda_1 P_1 + \cdots + \lambda_r P_r$ is the line segment
$[\sum_i \lambda_i \ell_i, \sum_i \lambda_i r_i]$, whose length is clearly a homogeneous linear
function of the $\lambda_i$.

Now assume the proposition has been proved for all combinations of
polytopes in $\mathbb{R}^{n-1}$, and consider polytopes $P_i$ in $\mathbb{R}^n$ and $\lambda_i \ge 0$. The
polytope $Q = \lambda_1 P_1 + \cdots + \lambda_r P_r$ depends on $\lambda_1, \ldots, \lambda_r$, but as long as
$\lambda_i > 0$ for all $i$, the $Q$'s all have the same set of inward pointing facet
normals (see Exercise 14 at the end of the section). Then, using the notation
of (1.3), we can write the formula of Proposition (4.6) as

(4.10)  $\mathrm{Vol}_n(Q) = \dfrac{1}{n} \sum_{\nu} a_Q(\nu)\, \mathrm{Vol}'_{n-1}(Q_\nu)$,

where the sum is over the set of common inward pointing facet normals $\nu$.
In this situation, the proof of Proposition (4.3) tells us that

$Q_\nu = \lambda_1 (P_1)_\nu + \cdots + \lambda_r (P_r)_\nu$.

By the induction hypothesis, for each $\nu$, the volume $\mathrm{Vol}'_{n-1}(Q_\nu)$ in (4.10)
is a homogeneous polynomial of degree $n - 1$ in $\lambda_1, \ldots, \lambda_r$ (the details of
this argument are similar to what we did in Proposition (4.8)).

Turning to $a_Q(\nu)$, we note that by Exercise 12 at the end of the section,

$a_Q(\nu) = a_{\lambda_1 P_1 + \cdots + \lambda_r P_r}(\nu) = \lambda_1 a_{P_1}(\nu) + \cdots + \lambda_r a_{P_r}(\nu)$.

Since $\nu$ is independent of the $\lambda_i$, it follows that $a_Q(\nu)$ is a homogeneous
linear function of $\lambda_1, \ldots, \lambda_r$. Multiplying $a_Q(\nu)$ and $\mathrm{Vol}'_{n-1}(Q_\nu)$, we see
that each term on the right hand side of (4.10) is a homogeneous polynomial
function of degree $n$, and the proposition follows. □

When $r = n$, we can single out one particular term in the polynomial
$\mathrm{Vol}_n(\lambda_1 P_1 + \cdots + \lambda_n P_n)$ that has special meaning for the whole collection
of polytopes.

(4.11) Definition. The $n$-dimensional mixed volume of a collection of
polytopes $P_1, \ldots, P_n$, denoted

$MV_n(P_1, \ldots, P_n)$,

is the coefficient of the monomial $\lambda_1 \cdot \lambda_2 \cdots \lambda_n$ in $\mathrm{Vol}_n(\lambda_1 P_1 + \cdots + \lambda_n P_n)$.

Exercise 7.

a. If $P_1$ is the unit square $\mathrm{Conv}(\{(0,0), (1,0), (0,1), (1,1)\})$ and $P_2$ is the
triangle $\mathrm{Conv}(\{(0,0), (1,0), (1,1)\})$, show that

$\mathrm{Vol}_2(\lambda_1 P_1 + \lambda_2 P_2) = \lambda_1^2 + 2\lambda_1\lambda_2 + \tfrac{1}{2}\lambda_2^2$,

and conclude that $MV_2(P_1, P_2) = 2$.

b. Show that if $P_i = P$ for all $i$, then the mixed volume is given by

$MV_n(P, P, \ldots, P) = n!\,\mathrm{Vol}_n(P)$.

Hint: First generalize part d of Exercise 3 to prove $\lambda_1 P + \cdots + \lambda_n P =
(\lambda_1 + \cdots + \lambda_n) P$, and then determine the coefficient of $\lambda_1 \lambda_2 \cdots \lambda_n$ in
$(\lambda_1 + \cdots + \lambda_n)^n$.

The basic properties of the $n$-dimensional mixed volume are given by the
following theorem.



(4.12) Theorem.

a. The mixed volume $MV_n(P_1, \ldots, P_n)$ is invariant if the $P_i$ are replaced
by their images under a volume-preserving transformation of $\mathbb{R}^n$ (for
example, a translation).

b. $MV_n(P_1, \ldots, P_n)$ is symmetric and linear in each variable.

c. $MV_n(P_1, \ldots, P_n) \ge 0$. Furthermore, $MV_n(P_1, \ldots, P_n) = 0$ if one of
the $P_i$ has dimension zero (i.e., if $P_i$ consists of a single point), and
$MV_n(P_1, \ldots, P_n) > 0$ if every $P_i$ has dimension $n$.

d. The mixed volume of any collection of polytopes can be computed as

$MV_n(P_1, \ldots, P_n) = \sum_{k=1}^{n} (-1)^{n-k} \sum_{I \subset \{1, \ldots, n\},\ |I| = k} \mathrm{Vol}_n\Big(\sum_{i \in I} P_i\Big)$,

where $\sum_{i \in I} P_i$ is the Minkowski sum of polytopes.

e. For all collections of lattice polytopes $P_1, \ldots, P_n$,

$MV_n(P_1, \ldots, P_n) = \sum_{\nu} a_{P_1}(\nu)\, MV'_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu)$,

where $a_{P_1}(\nu)$ is defined in (1.3) and the sum is over all primitive vectors
$\nu \in \mathbb{Z}^n$ such that $(P_i)_\nu$ has dimension $\ge 1$ for $i = 2, \ldots, n$. The notation
$MV'_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu)$ on the right stands for the normalized
mixed volume analogous to the normalized volume in Definition (4.5):

$MV'_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu) = \dfrac{MV_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu)}{\mathrm{Vol}_{n-1}(\mathcal{P})}$,

where $\mathcal{P}$ is a fundamental lattice parallelotope in the hyperplane $\nu^\perp$
orthogonal to $\nu$.



Proof. Part a follows directly from the definition of mixed volumes, as
does part b. We leave the details to the reader as Exercise 15 below.

The nonnegativity assertion of part c is quite deep, and a proof can be
found in [Ful], Section 5.4. This reference also proves positivity when the
$P_i$ all have dimension $n$. If $P_i$ has dimension zero, then adding the term
$\lambda_i P_i$ merely translates the sum of the other terms in $\lambda_1 P_1 + \cdots + \lambda_n P_n$ by
a vector whose length depends on $\lambda_i$. The volume of the resulting polytope
does not change, so that $\mathrm{Vol}_n(\lambda_1 P_1 + \cdots + \lambda_n P_n)$ is independent of $\lambda_i$.
Hence the coefficient of $\lambda_1 \cdot \lambda_2 \cdots \lambda_n$ in the expression for the volume must
be zero.

For part d, see [Ful], Section 5.4. Part e is a generalization of the volume
formula given in Proposition (4.6) and can be deduced from that result.
See Exercises 16 and 17 below. Proofs may also be found in [BoF], [Lei]
or [Ewa], Section IV.4. Note that by part c of the theorem, only $\nu$ with
$\dim (P_i)_\nu > 0$ can yield non-zero values for $MV'_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu)$. □

For instance, let's use Theorem (4.12) to compute the mixed volume
$MV_2(P_1, P_2)$ for the Newton polytopes of the polynomials from (4.1). In
the case of two polytopes in $\mathbb{R}^2$, the formula of part d reduces to:

$MV_2(P_1, P_2) = -\mathrm{Vol}_2(P_1) - \mathrm{Vol}_2(P_2) + \mathrm{Vol}_2(P_1 + P_2)$.

Using (4.7) and Exercise 6, we have $\mathrm{Vol}_2(P_1) = 4$ and $\mathrm{Vol}_2(P_2) = 5$. The
Minkowski sum $P_1 + P_2$ is the heptagon pictured in Fig. 7.6 above. Its
area may be found, for example, by subdividing the heptagon into four
trapezoids bounded by the horizontal lines $y = 0, 1, 2, 3, 6$. Using that
subdivision, we find

$\mathrm{Vol}_2(P_1 + P_2) = 3 + 11/2 + 23/4 + 51/4 = 27$.

The mixed volume is therefore

(4.13)  $MV_2(P_1, P_2) = -4 - 5 + 27 = 18$.
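
The computation leading to (4.13) can be reproduced with a few lines of Python. This is a sketch under stated assumptions rather than a general mixed volume routine: it works only in the plane, the helper names are our own, the Minkowski sum is formed as the convex hull of all pairwise vertex sums, and the vertices of $P_1$ and $P_2$ are the exponent vectors of $f_1 = ax^3y^2 + bx + cy^2 + d$ and $f_2 = exy^4 + fx^3 + gy$. The last function also evaluates $\mathrm{Vol}_2(\lambda_1 P_1 + \lambda_2 P_2)$ at rational $\lambda_i$, so that the homogeneous polynomial of Proposition (4.9) can be compared with $4\lambda_1^2 + 18\lambda_1\lambda_2 + 5\lambda_2^2$.

```python
from fractions import Fraction

def cross(o, a, b):
    """Twice the signed area of the triangle o, a, b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def area(points):
    """Area of the convex hull of the given points (shoelace formula)."""
    hull = convex_hull(points)
    k = len(hull)
    twice = sum(hull[i][0] * hull[(i + 1) % k][1]
                - hull[(i + 1) % k][0] * hull[i][1] for i in range(k))
    return Fraction(twice) / 2

def minkowski_sum(A, B):
    """Vertices of A + B, as the hull of all pairwise sums."""
    return convex_hull([(a[0] + b[0], a[1] + b[1]) for a in A for b in B])

def mixed_volume_2(A, B):
    """Part d of Theorem (4.12) for n = 2."""
    return area(minkowski_sum(A, B)) - area(A) - area(B)

def scaled_sum_area(l1, A, l2, B):
    """Vol_2(l1*A + l2*B) for nonnegative rational l1, l2."""
    sA = [(l1 * x, l1 * y) for x, y in A]
    sB = [(l2 * x, l2 * y) for x, y in B]
    return area(minkowski_sum(sA, sB))

P1 = [(0, 0), (1, 0), (3, 2), (0, 2)]
P2 = [(3, 0), (1, 4), (0, 1)]

print(minkowski_sum(P1, P2))   # the seven vertices of the heptagon
print(mixed_volume_2(P1, P2))
```

Evaluating `scaled_sum_area` at a few rational points is a cheap sanity check that the coefficient of $\lambda_1\lambda_2$ is the mixed volume.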

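Exercise 8. Check the result of this computation using the formula of
part e of Theorem (4.12). Hint: You will need to compute $a_{P_1}(\nu_{\mathcal{F}})$, $a_{P_1}(\nu_{\mathcal{G}})$
and $a_{P_1}(\nu_{\mathcal{H}})$, where $\nu_{\mathcal{F}}, \nu_{\mathcal{G}}, \nu_{\mathcal{H}}$ are the inward normals to the facets
$\mathcal{F}, \mathcal{G}, \mathcal{H}$ of $P_2$.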



In practice, computing the mixed volume $MV_n(P_1, \ldots, P_n)$ using the
formulas given by parts d and e of Theorem (4.12) can be very time
consuming. A better method, due to Sturmfels and Huber [HuS1] and Canny
and Emiris [EC], is given by the use of a mixed subdivision of the Minkowski
sum $P_1 + \cdots + P_n$. A brief description of mixed subdivisions will be given
in §6. See Section 6 of [EC] for references to other methods for computing
mixed volumes.

Exercise 9. Let $P_1, \ldots, P_n$ be lattice polytopes in $\mathbb{R}^n$.

a. Prove that the mixed volume $MV_n(P_1, \ldots, P_n)$ is an integer.

b. Explain how the result of part a generalizes Proposition (4.8). Hint: Use
Exercise 7.

We should remark that there are several different conventions in the 
literature concerning volumes and mixed volumes. Some authors include 
an extra factor of $1/n!$ in the definition of the mixed volume, so that
$MV_n(P, \ldots, P)$ will be exactly equal to $\mathrm{Vol}_n(P)$. When this is done, the
right side of the formula from part d of Theorem (4.12) acquires an extra
$1/n!$. Other authors include the extra factor of $n!$ in the definition of $\mathrm{Vol}_n$
itself (so that the “volume” of the n-dimensional simplex is 1). In other
words, care should be taken in comparing the formulas given here with
those found elsewhere!

Additional Exercises for §4

Exercise 10. Let $P_1, \ldots, P_r$ be polytopes in $\mathbb{R}^n$. This exercise will show
that the dimension of $\lambda_1 P_1 + \cdots + \lambda_r P_r$ is independent of the $\lambda_i$,
provided all $\lambda_i > 0$.

a. If $\lambda > 0$ and $p_0 \in P$, show that $(1 - \lambda)p_0 + \mathrm{Aff}(\lambda P + Q) = \mathrm{Aff}(P + Q)$.
This uses the affine subspaces discussed in Exercises 12 and 13 of §1.
Hint: $(1 - \lambda)p_0 + \lambda p + q = \lambda(p + q) - \lambda(p_0 + q) + p_0 + q$.

b. Conclude that $\dim(\lambda P + Q) = \dim(P + Q)$.

c. Prove that $\dim(\lambda_1 P_1 + \cdots + \lambda_r P_r)$ is independent of the $\lambda_i$, provided
all $\lambda_i > 0$.



Exercise 11. Let $m \cdot \nu = -a_P(\nu)$ be a supporting hyperplane of
$P = \mathrm{Conv}(A)$, where $A \subset \mathbb{R}^n$ is finite. Prove that

$P_\nu = \mathrm{Conv}(\{m \in A : m \cdot \nu = -a_P(\nu)\})$.

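Exercise 12. Let $a_P(\nu) = -\min_{m \in P}(m \cdot \nu)$ be as in (1.3).

a. Show that $(\lambda P)_\nu = \lambda P_\nu$ and $a_{\lambda P}(\nu) = \lambda a_P(\nu)$.

b. Show that $(P + Q)_\nu = P_\nu + Q_\nu$ and $a_{P+Q}(\nu) = a_P(\nu) + a_Q(\nu)$.

c. Conclude that $(\lambda_1 P_1 + \cdots + \lambda_r P_r)_\nu = \lambda_1 (P_1)_\nu + \cdots + \lambda_r (P_r)_\nu$ and

$a_{\lambda_1 P_1 + \cdots + \lambda_r P_r}(\nu) = \lambda_1 a_{P_1}(\nu) + \cdots + \lambda_r a_{P_r}(\nu)$.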

Exercise 13. Let $\nu^\perp$ be the hyperplane orthogonal to a nonzero vector
$\nu \in \mathbb{Z}^n$, and let $\{w_1, \ldots, w_{n-1}\}$ and $\{w'_1, \ldots, w'_{n-1}\}$ be any two bases for
the lattice $\nu^\perp \cap \mathbb{Z}^n$.

a. By expanding the $w'_i$ in terms of the $w_j$, show that there is an
$(n-1) \times (n-1)$ integer matrix $A = (a_{ij})$ such that $w'_i = \sum_{j=1}^{n-1} a_{ij} w_j$ for all
$i = 1, \ldots, n - 1$.

b. Reversing the roles of the two lattice bases, deduce that $A$ is invertible,
and $A^{-1}$ is also an integer matrix.

c. Deduce from part b that $\det(A) = \pm 1$.

d. Show that in the coordinate system defined by $w_1, \ldots, w_{n-1}$, $A$ defines
a volume preserving transformation from $\nu^\perp$ to itself. Explain why this
shows that any two fundamental lattice parallelotopes in $\nu^\perp$ have the
same $(n-1)$-dimensional volume.

Exercise 14. Fix polytopes $P_1, \ldots, P_r$ in $\mathbb{R}^n$ such that $P_1 + \cdots + P_r$ has
dimension $n$. Prove that for any positive reals $\lambda_1, \ldots, \lambda_r$, the polytopes
$\lambda_1 P_1 + \cdots + \lambda_r P_r$ all have the same inward pointing facet normals. Illustrate
your answer with a picture. Hint: If $\nu$ is an inward pointing facet normal
for $P_1 + \cdots + P_r$, then $(P_1 + \cdots + P_r)_\nu$ has dimension $n - 1$. This implies
that $(P_1)_\nu + \cdots + (P_r)_\nu$ has dimension $n - 1$ by Exercise 12. Now use
Exercise 10.

Exercise 15.

a. Using Definition (4.11), show that the mixed volume $MV_n(P_1, \ldots, P_n)$
is invariant under all permutations of the $P_i$.

b. Show that the mixed volume is linear in each variable:

$MV_n(P_1, \ldots, \lambda P_i + \mu P'_i, \ldots, P_n) = \lambda\, MV_n(P_1, \ldots, P_i, \ldots, P_n) + \mu\, MV_n(P_1, \ldots, P'_i, \ldots, P_n)$

for all $i = 1, \ldots, n$, and all $\lambda, \mu \ge 0$ in $\mathbb{R}$. Hint: When $i = 1$, consider
the polynomial representing $\mathrm{Vol}_n(\lambda P_1 + \lambda' P'_1 + \lambda_2 P_2 + \cdots + \lambda_n P_n)$
and look at the coefficients of $\lambda \lambda_2 \cdots \lambda_n$ and $\lambda' \lambda_2 \cdots \lambda_n$.



Exercise 16. In this exercise, we will consider several additional
properties of mixed volumes. Let $P, Q$ be polytopes in $\mathbb{R}^n$.

a. If $\lambda, \mu \ge 0$ are in $\mathbb{R}$, show that $\mathrm{Vol}_n(\lambda P + \mu Q)$ can be expressed in
terms of mixed volumes as follows:

$\dfrac{1}{n!} \sum_{k=0}^{n} \binom{n}{k} \lambda^k \mu^{n-k}\, MV_n(P, \ldots, P, Q, \ldots, Q)$,

where in the term corresponding to $k$, $P$ is repeated $k$ times and $Q$
is repeated $n - k$ times in the mixed volume. Hint: By Exercise 7,
$n!\,\mathrm{Vol}_n(\lambda P + \mu Q) = MV_n(\lambda P + \mu Q, \ldots, \lambda P + \mu Q)$.

b. Using part a, show that $MV_n(P, \ldots, P, Q)$ (which appears in the term
containing $\lambda^{n-1}\mu$ in the formula of part a) can also be expressed as

$(n-1)! \lim_{\mu \to 0^+} \dfrac{\mathrm{Vol}_n(P + \mu Q) - \mathrm{Vol}_n(P)}{\mu}$.

c. Explain how part b gives an interpretation of the mixed volume
$MV_n(P, \ldots, P, Q)$ as a constant times the surface area of $P$.



Exercise 17. In this exercise, we will use part b of Exercise 16 to prove
part e of Theorem (4.12). Replacing $Q$ by a translate, we may assume that
the origin is one of the vertices of $Q$.

a. Show that the Minkowski sum $P + \mu Q$ can be decomposed into: a
subpolytope congruent to $P$, prisms over each facet $\mathcal{F}$ of $P$ with height equal
to $\mu \cdot a_Q(\nu) \ge 0$, where $\nu = \nu_{\mathcal{F}}$, and other polyhedra with $n$-dimensional
volume bounded above by a constant times $\mu^2$.

b. From part a, deduce that

$\mathrm{Vol}_n(P + \mu Q) = \mathrm{Vol}_n(P) + \mu \sum_{\nu} a_Q(\nu)\, \mathrm{Vol}'_{n-1}(P_\nu) + O(\mu^2)$.

c. Using part b of Exercise 16, show that

$MV_n(P, \ldots, P, Q) = (n-1)! \sum_{\nu} a_Q(\nu)\, \mathrm{Vol}'_{n-1}(P_\nu)$,

where the sum is over the primitive inward normals $\nu$ to the facets of
$P$.

d. Now, to prove part e of Theorem (4.12), substitute

$P = \lambda_2 P_2 + \cdots + \lambda_n P_n$

and $Q = P_1$ into the formula of part c and use Exercises 7 and 15.

Exercise 18. Given polytopes $P_1, \ldots, P_r$ in $\mathbb{R}^n$, this exercise will show
that every coefficient of the polynomial representing

$\mathrm{Vol}_n(\lambda_1 P_1 + \cdots + \lambda_r P_r)$

is given by an appropriate mixed volume (up to a constant). We will use
the following notation. If $\alpha = (i_1, \ldots, i_r) \in \mathbb{Z}^r_{\ge 0}$ satisfies $|\alpha| = n$, then $\lambda^\alpha$
is the usual monomial in $\lambda_1, \ldots, \lambda_r$, and let $\alpha! = i_1!\, i_2! \cdots i_r!$. Also define

$MV_n(P; \alpha) = MV_n(P_1, \ldots, P_1, P_2, \ldots, P_2, \ldots, P_r, \ldots, P_r)$,

where $P_1$ appears $i_1$ times, $P_2$ appears $i_2$ times, ..., $P_r$ appears $i_r$ times.
Then prove that

$\mathrm{Vol}_n(\lambda_1 P_1 + \cdots + \lambda_r P_r) = \sum_{|\alpha| = n} \dfrac{1}{\alpha!}\, MV_n(P; \alpha)\, \lambda^\alpha$.

Hint: Generalize what you did in part a of Exercise 16.



§5 Bernstein’s Theorem 

In this section, we will study how the geometry of polytopes can be used 
to predict the number of solutions of a general system of n polynomial (or 
Laurent polynomial) equations $f_i(x_1, \ldots, x_n) = 0$. We will also indicate
how these results are related to a particular class of numerical root-finding 
methods called homotopy continuation methods. 

Throughout the section, we will use the following system of equations to 
illustrate the main ideas: 

(5.1)  $0 = f_1(x, y) = ax^3y^2 + bx + cy^2 + d$
       $0 = f_2(x, y) = exy^4 + fx^3 + gy$,

where the coefficients $a, \ldots, g$ are in $\mathbb{C}$. These are the same polynomials
used in §4. We want to know how many solutions these equations have.
We will begin by studying this question using the methods of Chapters 2 
and 3, and then we will see that the mixed volume discussed in §4 has an 
important role to play. This will lead naturally to Bernstein’s Theorem, 
which is the main result of the section. 

Let's first proceed as in §1 of Chapter 2 to find the solutions of (5.1).
Since different choices of $a, \ldots, g$ could potentially lead to different
numbers of solutions, we will initially treat the coefficients $a, \ldots, g$ in (5.1)
as symbolic parameters. This means working over the field $\mathbb{C}(a, \ldots, g)$ of
rational functions in $a, \ldots, g$. Using a lex Gröbner basis to eliminate $y$, it
is easy to check that the reduced Gröbner basis for the ideal $\langle f_1, f_2 \rangle$ in the
ring $\mathbb{C}(a, \ldots, g)[x, y]$ has the form

(5.2)  $0 = y + p_{17}(x)$
       $0 = p_{18}(x)$,

where $p_{17}(x)$ and $p_{18}(x)$ are polynomials in $x$ alone, of degrees 17 and
18 respectively. The coefficients in $p_{17}$ and $p_{18}$ are rational functions in
$a, \ldots, g$. Gröbner basis theory tells us that we can transform (5.2) back
into our original equations (5.1), and vice versa. These transformations will
also have coefficients in $\mathbb{C}(a, \ldots, g)$.

Now assign numerical values in $\mathbb{C}$ to $a, \ldots, g$. We claim that for “most”
choices of $a, \ldots, g \in \mathbb{C}$, (5.1) is still equivalent to (5.2). This is because
transforming (5.1) into (5.2) and back involves a finite number of elements of
$\mathbb{C}(a, \ldots, g)$. If we pick $a, \ldots, g \in \mathbb{C}$ so that none of the denominators
appearing in these elements vanish, then our transformations will still work
for the chosen numerical values of $a, \ldots, g$. In fact, for most choices, (5.2)
remains a Gröbner basis for (5.1); this is related to the idea of
specialization of a Gröbner basis, which is discussed in Chapter 6, §3 of [CLO],
especially Exercises 7-9.

The equivalence of (5.1) and (5.2) for most choices of $a, \ldots, g \in \mathbb{C}$ can
be stated more geometrically as follows. Let $\mathbb{C}^7$ denote the affine space
consisting of all possible ways of choosing $a, \ldots, g \in \mathbb{C}$, and let $P$ be
the product of all of the denominators appearing in the transformation of
(5.1) to (5.2) and back. Note that $P(a, \ldots, g) \ne 0$ implies that all of the
denominators are nonvanishing. Thus, (5.1) is equivalent to (5.2) for all
coefficients $(a, \ldots, g) \in \mathbb{C}^7$ such that $P(a, \ldots, g) \ne 0$. As defined in §5
of Chapter 3, this means that the two systems of equations are equivalent
generically. We will make frequent use of the term “generic” in this section.



Exercise 1. Consider the equations (5.1) with symbolic coefficients.

a. Using Maple or another computer algebra system, compute the exact
form of the Gröbner basis (5.2) and identify explicitly a polynomial $P$
such that if $P(a, \ldots, g) \ne 0$, then (5.1) is equivalent to a system of the
form (5.2). Hint: One can transform (5.1) into (5.2) using the division
algorithm. Going the other way is more difficult. The Maple package
described in the section on Maple in Appendix D of [CLO] can be used for
this purpose.

b. Show that there is another polynomial $P'$ such that if $P'(a, \ldots, g) \ne 0$,
then the solutions lie in $(\mathbb{C}^*)^2$, where as usual $\mathbb{C}^* = \mathbb{C} \setminus \{0\}$.

Since (5.2) clearly has at most 18 distinct solutions in $\mathbb{C}^2$, the same is true
generically for (5.1). Exercise 8 will show that for generic $(a, \ldots, g)$, $p_{18}$
has distinct solutions, so that (5.1) has precisely 18 solutions in the generic
case. Then, using part b of Exercise 1, we conclude that generically, (5.1)
has 18 solutions, all of which lie in $(\mathbb{C}^*)^2$. This will be useful below.

We next turn to §5 of Chapter 3, where we learned about Bézout's Theorem
and solving equations via resultants. Since the polynomials $f_1$ and $f_2$ have
total degree 5, Bézout's Theorem predicts that (5.1) should have at most
$5 \cdot 5 = 25$ solutions in $\mathbb{P}^2$. If we homogenize these equations using a third
variable $z$, we get

$0 = F_1(x, y, z) = ax^3y^2 + bxz^4 + cy^2z^3 + dz^5$
$0 = F_2(x, y, z) = exy^4 + fx^3z^2 + gyz^4$.

Here, solutions come in two flavors: affine solutions, which are the solutions
of (5.1), and solutions “at $\infty$”, which have $z = 0$. Assuming $ae \ne 0$ (which
holds generically), it is easy to see that the solutions at $\infty$ are $(0, 1, 0)$ and
$(1, 0, 0)$. This, combined with Bézout's Theorem, tells us that (5.1) has at
most 23 solutions in $\mathbb{C}^2$.

Why do we get 23 instead of 18, which is the actual number? One way to
resolve this discrepancy is to realize that the solutions $(0, 1, 0)$ and $(1, 0, 0)$
at $\infty$ have multiplicities (in the sense of Chapter 4) bigger than 1. By
computing these multiplicities, one can prove that there are 18 solutions.
However, it is more important to realize that by Bézout's Theorem, generic
equations $f_1 = f_2 = 0$ of total degree 5 in $x, y$ have 25 solutions in $\mathbb{C}^2$.
The key point is that the equations in (5.1) are not generic in this sense: a
typical polynomial $f(x, y)$ of total degree 5 has 21 terms, while those in
(5.1) have far fewer. In the terminology of §2, we have sparse polynomials
(those with fixed Newton polytopes), and what we're looking for is a sparse
Bézout's Theorem. As we will see below, this is precisely what Bernstein's
Theorem does for us.

At this point, the reader might be confused about our use of the word
“generic”. We just finished saying that the equations (5.1) aren't generic,
yet in our discussion of Gröbner bases, we showed that generically, (5.1)
has 18 solutions. This awkwardness is resolved by observing that generic is
always relative to a particular set of Newton polytopes. To state this more
precisely, suppose we fix finite sets $A_1, \ldots, A_l \subset \mathbb{Z}^n$. Each $A_i$ gives the set
$L(A_i)$ of Laurent polynomials

$f_i = \sum_{\alpha \in A_i} c_{i,\alpha} x^\alpha$.

Note that we can regard each $L(A_i)$ as an affine space with the coefficients
$c_{i,\alpha}$ as coordinates. Then we can define generic as follows.

(5.3) Definition. A property is said to hold generically for Laurent
polynomials $(f_1, \ldots, f_l) \in L(A_1) \times \cdots \times L(A_l)$ if there is a nonzero polynomial
in the coefficients of the $f_i$ such that the property holds for all $f_1, \ldots, f_l$
for which the polynomial is nonvanishing.

This definition generalizes Definition (5.6) from Chapter 3. Also
observe that by Exercise 10 of §1, the Newton polytope $\mathrm{NP}(f_i)$ of a generic
$f_i \in L(A_i)$ satisfies $\mathrm{NP}(f_i) = \mathrm{Conv}(A_i)$. Thus we can speak of generic
polynomials with fixed Newton polytopes. In particular, for polynomials of
total degree 5, Bézout's Theorem deals with generic relative to the Newton
polytope determined by all monomials $x^i y^j$ with $i + j \le 5$, while for (5.1),
generic means relative to the Newton polytopes of $f_1$ and $f_2$. The difference
in Newton polytopes explains why there is no conflict between our various
uses of the term “generic”.

One also could ask if resultants can help solve (5.1). This was discussed 
in §5 of Chapter 3, where we usually assumed our equations had no
solutions at $\infty$. Since (5.1) does have solutions at $\infty$, standard procedure
suggests making a random change of coordinates in (5.1). With high prob- 
ability, this would make all of the solutions affine, but it would destroy 
the sparseness of the equations. In fact, it should be clear that rather than 
the classical multipolynomial resultants of Chapter 3, we want to use the 
sparse resultants of §2 of this chapter. Actually, we need something slightly 
more general, since §2 assumes that the Newton polytopes are all equal, 
which is not the case for (5.1). In §6 we will learn about more general sparse 
resultants which can be used to study (5.1). 

The above discussion leads to the first main question of the section. 
Suppose we have Laurent polynomials $f_1, \ldots, f_n \in \mathbb{C}[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$ such
that $f_1 = \cdots = f_n = 0$ have finitely many solutions in $(\mathbb{C}^*)^n$. Then we
want to know if there is a way to predict an upper bound on the number
of solutions of $f_1 = \cdots = f_n = 0$ in $(\mathbb{C}^*)^n$ that is more refined than
the Bézout Theorem bound $\deg(f_1) \cdot \deg(f_2) \cdots \deg(f_n)$. Ideally, we want
a bound that uses only information about the forms of the polynomials
$f_i$ themselves. In particular, we want to avoid computing Gröbner bases
and studying the ring $A = \mathbb{C}[x_1, \ldots, x_n]/\langle f_1, \ldots, f_n \rangle$ as in Chapter 2, if
possible.

To see how mixed volumes enter the picture, let $P_1$ and $P_2$ denote the
Newton polytopes of the polynomials $f_1, f_2$ in (5.1). Referring back to
equation (4.13) from the previous section, note that the mixed volume of
these polytopes satisfies

$MV_2(P_1, P_2) = 18$,

which agrees with the number of solutions of the system (5.1) for generic
choices of the coefficients. Surely this is no coincidence! As a further test,
consider instead two generic polynomials of total degree 5. Here, the
Newton polytopes are both the simplex $Q_5 \subset \mathbb{R}^2$ described in Exercise 2 of §1,
which has volume $\mathrm{Vol}_2(Q_5) = 25/2$ by Exercise 3 of that section. Using
Exercise 7 of §4, we conclude that

$MV_2(Q_5, Q_5) = 2\,\mathrm{Vol}_2(Q_5) = 25$,

so that again, the mixed volume predicts the number of solutions.

Exercise 2. More generally, polynomials of total degrees $d_1, \ldots, d_n$ in
$x_1, \ldots, x_n$ have Newton polytopes given by the simplices $Q_{d_1}, \ldots, Q_{d_n}$,
respectively. Use the properties of mixed volume from §4 to prove that

$MV_n(Q_{d_1}, \ldots, Q_{d_n}) = d_1 \cdots d_n$,

so that the general Bézout bound is the mixed volume of the appropriate
Newton polytopes.
The main result of this section is a theorem of Bernstein relating the 
number of solutions to the mixed volume of the Newton polytopes of the 
equations. A slightly unexpected fact is that the theorem predicts the
numbers of solutions in $(\mathbb{C}^*)^n$ rather than in $\mathbb{C}^n$. We will explain why at the
end of the section.

(5.4) Theorem (Bernstein's Theorem). Given Laurent polynomials
$f_1, \ldots, f_n$ over $\mathbb{C}$ with finitely many common zeroes in $(\mathbb{C}^*)^n$, let
$P_i = \mathrm{NP}(f_i)$ be the Newton polytope of $f_i$ in $\mathbb{R}^n$. Then the number of
common zeroes of the $f_i$ in $(\mathbb{C}^*)^n$ is bounded above by the mixed volume
$MV_n(P_1, \ldots, P_n)$. Moreover, for generic choices of the coefficients in the
$f_i$, the number of common solutions is exactly $MV_n(P_1, \ldots, P_n)$.

Proof. We will sketch the main ideas in Bernstein's proof, and indicate
how $MV_n(P_1, \ldots, P_n)$ solutions of a generic system can be found. However,
proving that this construction finds all the solutions of a generic system
in $(\mathbb{C}^*)^n$ requires some additional machinery. Bernstein uses the theory
of Puiseux expansions of algebraic functions for this; a more geometric
understanding is obtained via the theory of projective toric varieties. We
will state the relevant facts here without proof. For this and other details
of the proof, we will refer the reader to [Ber] (references to other proofs
will be given below).

The proof is by induction on $n$. For $n = 1$, we have a single Laurent
polynomial $f(x) = 0$ in one variable. After multiplying by a suitable Laurent
monomial $x^a$, we obtain a polynomial equation

(5.5)  $0 = \hat{f}(x) = x^a f(x) = c_m x^m + c_{m-1} x^{m-1} + \cdots + c_0$,

where $m \ge 0$. Multiplying by $x^a$ does not affect the solutions of $f(x) = 0$
in $\mathbb{C}^*$. By the Fundamental Theorem of Algebra, we see that both (5.5)
and the original equation $f = 0$ have $m$ roots (counting multiplicity) in $\mathbb{C}^*$
provided $c_m c_0 \ne 0$. Furthermore, as explained in Exercise 8 at the end of
the section, $\hat{f}$ has distinct roots when $c_0, \ldots, c_m$ are generic. Thus,
generically, $f = 0$ has $m$ distinct roots in $\mathbb{C}^*$. However, the Newton polytope
$P = \mathrm{NP}(f)$ is a translate of $\mathrm{NP}(\hat{f})$, which is the interval $[0, m]$ in $\mathbb{R}$. By
Exercise 7 of §4, the mixed volume $MV_1(P)$ equals the length of $P$, which
is $m$. This establishes the base case of the induction.
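
The bookkeeping in the base case is simple enough to spell out in code. In the following Python sketch, a Laurent polynomial in one variable is represented (our own convention, chosen only for illustration) as a dictionary from integer exponents to nonzero coefficients. The functions compute the Newton polytope, multiply by a Laurent monomial $x^a$, and return the length of the Newton polytope, which is the number of roots in $\mathbb{C}^*$ counted with multiplicity.

```python
def newton_interval(f):
    """Endpoints of NP(f) for a Laurent polynomial {exponent: coefficient}."""
    exps = [k for k, c in f.items() if c != 0]
    return min(exps), max(exps)

def shift(f, a):
    """Multiply f by the Laurent monomial x**a."""
    return {k + a: c for k, c in f.items()}

def root_count_in_torus(f):
    """Length of NP(f): the number of roots in C*, counted with multiplicity."""
    lo, hi = newton_interval(f)
    return hi - lo

# f(x) = 2x^(-2) - 3x + x^3, an arbitrary example.
f = {-2: 2, 1: -3, 3: 1}
lo, hi = newton_interval(f)
f_hat = shift(f, -lo)          # 2 - 3x^3 + x^5, with nonzero constant term

print(newton_interval(f), newton_interval(f_hat), root_count_in_torus(f))
```

Here `f_hat` is an honest polynomial of degree 5 with nonzero constant term, so the Fundamental Theorem of Algebra gives 5 roots, none of them zero, matching the length of $\mathrm{NP}(f) = [-2, 3]$.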

The induction step will use the geometry of the Minkowski sum $P =
P_1 + \cdots + P_n$. The basic idea is that for each primitive inward-pointing
facet normal $\nu \in \mathbb{Z}^n$ of $P$, we will deform the equations $f_1 = \cdots = f_n = 0$
by varying the coefficients until some of them are zero. Using the induction
hypothesis, we will show that in the limit, the number of solutions of the
deformed equations is given by

(5.6)  $a_{P_1}(\nu)\, MV'_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu)$,

where $a_{P_1}(\nu)$ is defined in (1.3) and $MV'_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu)$ is the
normalized $(n-1)$-dimensional mixed volume defined in Theorem (4.12). We
will also explain how each of these solutions contributes a solution to our
original system. Adding up these solutions over all facet normals $\nu$ of $P$
gives the sum

(5.7)  $\sum_{\nu} a_{P_1}(\nu)\, MV'_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu) = MV_n(P_1, \ldots, P_n)$,

where the equality follows from Theorem (4.12). To complete the induction
step, we would need to show that the total number of solutions of the
original system in $(\mathbb{C}^*)^n$ is generically equal to, and in any case no larger
than, the sum given by (5.7). The proof is beyond the scope of this book,
so we will not do this. Instead, we will content ourselves with showing
explicitly how each facet normal $\nu$ of $P$ gives a deformation of the equations
$f_1 = \cdots = f_n = 0$ which in the limit has (5.6) as its generic number of
solutions.

To carry out this strategy, let $\nu \in \mathbb{Z}^n$ be the primitive inward-pointing
normal to a facet of $P$. As usual, the facet is denoted $P_\nu$, and we know
from §4 that

$P_\nu = (P_1)_\nu + \cdots + (P_n)_\nu$,

where $(P_i)_\nu$ is the face (not necessarily a facet) of the Newton polytope
$P_i = \mathrm{NP}(f_i)$ determined by $\nu$. By §1, $(P_i)_\nu$ is the convex hull of those $\alpha$
minimizing $\nu \cdot \alpha$ among the monomials $x^\alpha$ from $f_i$. In other words, if the
face $(P_i)_\nu$ lies in the hyperplane $m \cdot \nu = -a_{P_i}(\nu)$, then for all exponents $\alpha$
of $f_i$, we have

$\alpha \cdot \nu \ge -a_{P_i}(\nu)$,

with equality holding if and only if $\alpha \in (P_i)_\nu$. This means that $f_i$ can be
written

(5.8)  $f_i = \sum_{\nu \cdot \alpha = -a_{P_i}(\nu)} c_{i,\alpha} x^\alpha + \sum_{\nu \cdot \alpha > -a_{P_i}(\nu)} c_{i,\alpha} x^\alpha$.

Before we can deform our equations, we first need to change $f_1$ slightly.
If we multiply $f_1$ by $x^{-\alpha}$ for some exponent vector $\alpha \in P_1$
of a monomial appearing in $f_1$, then we may assume that there is a nonzero
constant term $c_1$ in $f_1$. This means $0 \in P_1$, so that
$a_{P_1}(\nu) \ge 0$ by the above inequality. As noted in the base case,
changing $f_1$ in this way affects neither the solutions of the system in
$(\mathbb{C}^*)^n$ nor the mixed volume of the Newton polytopes.

We also need to introduce some new coordinates. In Exercise 9 below,
you will show that since $\nu$ is primitive, there is an invertible
$n \times n$ integer matrix $B$ such that $\nu$ is its first row and its
inverse is also an integer matrix. If we write $B = (b_{ij})$, then consider
the coordinate change

(5.9)  $x_j \mapsto \prod_{i=1}^{n} y_i^{-b_{ij}}.$

This maps $x_j$ to the Laurent monomial in the new variables
$y_1, \ldots, y_n$ whose exponents are the integers appearing in the $j$th
column of the matrix $-B$. (The minus sign is needed because $\nu$ is an
inward-pointing normal.) Under this change of coordinates, it is easy to
check that the Laurent monomial $x^\alpha$ maps to the Laurent monomial
$y^{-B\alpha}$, where $B\alpha$ is the usual matrix multiplication, regarding
$\alpha$ as a column vector. See Exercise 10 below.
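To make the coordinate change concrete, here is a minimal Python sketch of
the rule $x^\alpha \mapsto y^{-B\alpha}$ acting on exponent vectors. The matrix
`B` and the exponent vectors are assumptions chosen for illustration (the
first row of `B` is the primitive vector $\nu = (-2,-1)$, and the exponents
are the vertices of the polytope $P_1$ from Exercise 1 of §6); the helper name
`change_exponent` is hypothetical.

```python
def change_exponent(B, alpha):
    """Exponent vector of y^(-B*alpha), the image of x^alpha under (5.9)."""
    return tuple(-sum(b * a for b, a in zip(row, alpha)) for row in B)

# Assumed example: an integer matrix with determinant 1 whose first row is
# the primitive vector nu = (-2, -1).
B = [[-2, -1],
     [1, 0]]

# Assumed exponent vectors of a Laurent polynomial (illustration only).
exponents = [(3, 2), (1, 0), (0, 2), (0, 0)]
images = [change_exponent(B, alpha) for alpha in exponents]

# The exponent of y_1 in the image of x^alpha is -(nu . alpha), so the largest
# y_1-exponent equals a_P(nu) = -min(nu . alpha).
nu = B[0]
a_P = -min(sum(n * a for n, a in zip(nu, alpha)) for alpha in exponents)
max_y1_exponent = max(image[0] for image in images)
```

In this example the monomial with exponent $(3,2)$, which minimizes
$\nu \cdot \alpha$, picks up the largest power of $y_1$, as the discussion of
(5.8) predicts.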

If we apply this coordinate change to $f_i$, note that a monomial $x^\alpha$
appearing in the first sum of (5.8) becomes

$y^{-B\alpha} = y_1^{a_{P_i}(\nu)}\, y_2^{\beta_2} \cdots y_n^{\beta_n}$

(for some integers $\beta_2, \ldots, \beta_n$) since
$\nu \cdot \alpha = -a_{P_i}(\nu)$ and $\nu$ is the first row of $B$.
Similarly, a monomial $x^\alpha$ in the second sum of (5.8) becomes

$y^{-B\alpha} = y_1^{\beta_1}\, y_2^{\beta_2} \cdots y_n^{\beta_n}, \quad \beta_1 < a_{P_i}(\nu).$

It follows from (5.8) that $f_i$ transforms into a polynomial of the form

$g_{i\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_i}(\nu)} + \sum_{j < a_{P_i}(\nu)} g_{ij\nu}(y_2, \ldots, y_n)\, y_1^{j}.$

Note also that the Newton polytope of $g_{i\nu}(y_2, \ldots, y_n)$ is equal to
the image under the linear mapping defined by the matrix $B$ of the face
$(P_i)_\nu$.

Thus the equations $f_1 = \cdots = f_n = 0$ map to the new system

(5.10)
$0 = g_{1\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_1}(\nu)} + \sum_{j < a_{P_1}(\nu)} g_{1j\nu}(y_2, \ldots, y_n)\, y_1^{j}$
$0 = g_{2\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_2}(\nu)} + \sum_{j < a_{P_2}(\nu)} g_{2j\nu}(y_2, \ldots, y_n)\, y_1^{j}$
$\vdots$
$0 = g_{n\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_n}(\nu)} + \sum_{j < a_{P_n}(\nu)} g_{nj\nu}(y_2, \ldots, y_n)\, y_1^{j}$

under the coordinate change $x^\alpha \mapsto y^{-B\alpha}$. As above, the
constant term of $f_1$ is denoted $c_1$, and we now deform these equations by
substituting

$c_1 \mapsto \dfrac{c_1}{t^{a_{P_1}(\nu)}}, \qquad y_1 \mapsto \dfrac{y_1}{t}$

in (5.10), where $t$ is a new variable, and then multiplying the $i$th
equation by $t^{a_{P_i}(\nu)}$. To see what this looks like, first suppose
that $a_{P_1}(\nu) > 0$. This means that in the first equation of (5.10),
$c_1$ is the $j = 0$ term in the sum. Then you can check that the deformation
has the effect of leaving $c_1$ and the $g_{i\nu}$ unchanged, and multiplying
all other terms by positive powers of $t$. It follows that the deformed
equations can be written in the form

(5.11)
$0 = g_{1\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_1}(\nu)} + c_1 + O(t)$
$0 = g_{2\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_2}(\nu)} + O(t)$
$\vdots$
$0 = g_{n\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_n}(\nu)} + O(t),$

where the notation $O(t)$ means a sum of terms each divisible by $t$.

When $t = 1$, the equations (5.11) coincide with (5.10). Also, from the
point of view of our original equations $f_i = 0$, note that (5.11)
corresponds to multiplying each term in the second sum of (5.8) by a positive
power of $t$, with the exception of the constant term $c_1$ of $f_1$, which is
unchanged.

Now, in (5.11), let $t \to 0$ along a general path in $\mathbb{C}$. This
gives the equations

$0 = g_{1\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_1}(\nu)} + c_1$
$0 = g_{2\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_2}(\nu)}$
$\vdots$
$0 = g_{n\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_n}(\nu)},$

which, in terms of solutions in $(\mathbb{C}^*)^n$, are equivalent to

(5.12)
$0 = g_{1\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_1}(\nu)} + c_1$
$0 = g_{2\nu}(y_2, \ldots, y_n)$
$\vdots$
$0 = g_{n\nu}(y_2, \ldots, y_n).$

It can be shown that for a sufficiently generic original system of equations,
the equations $g_{2\nu} = \cdots = g_{n\nu} = 0$ in (5.12) are generic with
respect to $B \cdot (P_2)_\nu, \ldots, B \cdot (P_n)_\nu$. Hence, applying the
induction hypothesis to the last $n - 1$ equations in (5.12), we see that
there are

$MV'_{n-1}(B \cdot (P_2)_\nu, \ldots, B \cdot (P_n)_\nu)$

possible solutions $(y_2, \ldots, y_n) \in (\mathbb{C}^*)^{n-1}$ of these
$n - 1$ equations. In Exercise 11 below, you will show that

$MV'_{n-1}(B \cdot (P_2)_\nu, \ldots, B \cdot (P_n)_\nu) = MV'_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu),$

where $MV'_{n-1}$ is the normalized mixed volume from Theorem (4.12).

For each $(y_2, \ldots, y_n)$ solving the last $n - 1$ equations in (5.12),
there are $a_{P_1}(\nu)$ possible values for $y_1 \in \mathbb{C}^*$ provided
$g_{1\nu}(y_2, \ldots, y_n) \ne 0$ and $c_1 \ne 0$. This is true generically
(we omit the proof), so that the total number of solutions of (5.12) is

$a_{P_1}(\nu)\, MV'_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu),$

which agrees with (5.6).

The next step is to prove that for each solution $(y_1, \ldots, y_n)$ of
(5.12), one can find parametrized solutions $(y_1(t), \ldots, y_n(t))$ of the
deformed equations (5.11) satisfying
$(y_1(0), \ldots, y_n(0)) = (y_1, \ldots, y_n)$. This step involves some
concepts we haven't discussed (the functions $y_i(t)$ are not polynomials in
$t$), so we will not go into the details here, though the discussion following
the proof will shed some light on what is involved.

Once we have the parametrized solutions $(y_1(t), \ldots, y_n(t))$, we can
follow them back to $t = 1$ to get solutions $(y_1(1), \ldots, y_n(1))$ of
(5.10). Since the inverse of the matrix $B$ has integer entries, each of these
solutions $(y_1(1), \ldots, y_n(1))$ can be converted back to a unique
$(x_1, \ldots, x_n)$ using the inverse of (5.9) (see Exercise 10 below). It
follows that the equations (5.12) give rise to (5.6) many solutions of our
original equations.

This takes care of the case when $a_{P_1}(\nu) > 0$. Since we arranged $f_1$
so that $a_{P_1}(\nu) \ge 0$, we still need to consider what happens when
$a_{P_1}(\nu) = 0$. Here, $c_1$ lies in the first sum of (5.8) for $f_1$, so
that under our coordinate change, it becomes the constant term of
$g_{1\nu}$. This means that instead of (5.11), the first deformed equation can
be written as

$0 = g_{1\nu}(y_2, \ldots, y_n) + O(t)$

since $a_{P_1}(\nu) = 0$ and $c_1$ appears in $g_{1\nu}$. Combined with the
deformed equations from (5.11) for $2 \le i \le n$, the limit as $t \to 0$
gives the equations

$0 = g_{i\nu}(y_2, \ldots, y_n)\, y_1^{a_{P_i}(\nu)}, \quad 1 \le i \le n.$

As before, the $(\mathbb{C}^*)^n$ solutions are the same as the solutions of
the equations

$0 = g_{i\nu}(y_2, \ldots, y_n), \quad 1 \le i \le n.$

However, one can show that $g_{1\nu}$ is generic and hence doesn't vanish at
the solutions of $g_{2\nu} = \cdots = g_{n\nu} = 0$. This means that
generically, the $t \to 0$ limit of the deformed system has no solutions,
which agrees with (5.6), since (5.6) is zero when $a_{P_1}(\nu) = 0$.

We conclude that each facet contributes (5.6) many solutions to our
original equations, and adding these up as in (5.7), we get the mixed volume
$MV_n(P_1, \ldots, P_n)$. This completes our sketch of the proof. □

In addition to Bernstein's original paper [Ber], there are closely related
papers by Kushnirenko [Kus] and Khovanskii [Kho]. For this reason, the
mixed volume bound $MV_n(P_1, \ldots, P_n)$ on the number of solutions given
in Theorem (5.4) is sometimes called the BKK bound. A geometric
interpretation of the BKK bound in the context of toric varieties is given in
[Ful] and [GKZ], and a more refined version can be found in [Roj3]. Also,
[HuS1] and [Roj1] study the genericity conditions needed to ensure that
exactly $MV_n(P_1, \ldots, P_n)$ different solutions exist in
$(\mathbb{C}^*)^n$. These papers use a variety of methods, including sparse
elimination theory and toric varieties.

The proof we sketched for the BKK bound uses the formula

$\sum_{\nu} a_{P_1}(\nu)\, MV'_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu) = MV_n(P_1, \ldots, P_n)$

from Theorem (4.12). If you look back at the statement of this theorem in
§4, you'll see that the sum is actually taken over all facet normals $\nu$
such that $(P_2)_\nu, \ldots, (P_n)_\nu$ all have dimension at least one. This
restriction on $\nu$ relates nicely to the proof of the BKK bound as follows.
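In the plane the formula above is easy to evaluate by hand or by machine. The
following Python sketch does so under stated assumptions: $n = 2$, the
polygons are given by their vertices in counterclockwise order, and
$MV'_1$ of an edge is taken to be its normalized (lattice) length, so only
edges of the second polygon contribute. The function names and the choice of
polygons (those of Exercise 1 of §6) are ours, for illustration only.

```python
from math import gcd

def support_a(P, nu):
    """a_P(nu) = -min over the vertices alpha of P of nu . alpha."""
    return -min(nu[0] * a[0] + nu[1] * a[1] for a in P)

def facet_sum(P1, P2):
    """Sum over edges of P2 of a_{P1}(nu) times the lattice length of the edge.

    Assumes P2 lists its vertices in counterclockwise order, so the inward
    normal of the edge with direction (dx, dy) is a multiple of (-dy, dx).
    """
    total = 0
    n = len(P2)
    for k in range(n):
        (x0, y0), (x1, y1) = P2[k], P2[(k + 1) % n]
        dx, dy = x1 - x0, y1 - y0
        length = gcd(abs(dx), abs(dy))       # lattice length of the edge
        nu = (-dy // length, dx // length)   # primitive inward-pointing normal
        total += support_a(P1, nu) * length
    return total

# Polygons from Exercise 1 of Section 6, vertices in counterclockwise order.
P1 = [(0, 0), (1, 0), (3, 2), (0, 2)]
P2 = [(0, 1), (3, 0), (1, 4)]
bkk = facet_sum(P1, P2)
```

For these polygons the three edges of `P2` contribute 0, 16 and 2, for a
total of 18; exchanging the roles of the two polygons gives the same total, as
the symmetry of mixed volume requires.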

Exercise 3. In the proof of Theorem (5.4), we obtained the system (5.10)
of transformed equations. Suppose that for some $i$ between 2 and $n$,
$(P_i)_\nu$ has dimension zero. Then show that in (5.10), the corresponding
$g_{i\nu}$ consists of a single term, and conclude that in the limit (5.12) of
the deformed equations, the last $n - 1$ equations have no solutions
generically.

Exercise 4. Consider the equations $f_1 = f_2 = 0$ from (5.1). In this
exercise, you will explicitly construct the coordinate changes used in the
proof of Bernstein's theorem.

a. Use the previous exercise to show that in this case, the vectors $\nu$ that
must be considered are all among the facet normals of the polytope
$P_2 = \mathrm{NP}(f_2)$. These normals, denoted $\nu_{\mathcal{F}}$,
$\nu_{\mathcal{G}}$ and $\nu_{\mathcal{H}}$, were computed in Exercise 5 of
§4 and in the discussion preceding that exercise. Also, the mixed volume
$MV_2(P_1, P_2) = 18$ was computed in (4.13).

b. Show that $a_{P_1}(\nu_{\mathcal{F}}) = 0$. Hence the term from (5.7) with
$\nu = \nu_{\mathcal{F}}$ is zero.

c. For $\nu = \nu_{\mathcal{G}}$, show that

$B = \begin{pmatrix} -2 & -1 \\ 1 & 0 \end{pmatrix}$

has $\nu$ as first row. Also show that $B^{-1}$ has integer entries.

d. Apply the corresponding change of variables

$x \mapsto z^{2} w^{-1}, \quad y \mapsto z$

to (5.1). Note that we are calling the "old" variables $x, y$ and the "new"
ones $z, w$ rather than using subscripts. In particular, $z$ plays the role of
the variable $y_1$ used in the proof.

e. After substituting $d \mapsto d/t^{8}$ and $z \mapsto z/t$, multiply by the
appropriate powers of $t$ to obtain

$0 = a w^{-3} z^{8} + d + t^{6} \cdot b w^{-1} z^{2} + t^{6} \cdot c z^{2}$
$0 = (e w^{-1} + f w^{-3}) z^{6} + t^{5} \cdot g z.$

f. Let $t \to 0$ and count the number of solutions of the deformed system.
Show that this number equals
$a_{P_1}(\nu_{\mathcal{G}})\, MV'_1(\mathcal{G})$.

g. Finally, carry out steps c-f for the facet $\mathcal{H}$ of $P_2$, and show
we obtain 18 solutions.

Exercise 5. Use Bernstein's theorem to deduce a statement about the
number of solutions in $(\mathbb{C}^*)^n$ of a generic system of Laurent
polynomial equations $f_1 = \cdots = f_n = 0$ when the Newton polytopes of the
$f_i$ are all equal. (This was the case considered by Khovanskii in [Kho].)

Exercise 6. Use Bernstein's Theorem and Exercise 2 to obtain a version
of the usual Bézout theorem. Your version will be slightly different from
those discussed in §5 of Chapter 3 because of the $(\mathbb{C}^*)^n$
restriction.

While the BKK bound tells us about the number of solutions in
$(\mathbb{C}^*)^n$, one could also ask about the number of solutions in
$\mathbb{C}^n$. For example, for (5.1), we checked earlier that generically,
these equations have $MV_2(P_1, P_2) = 18$ solutions in either $\mathbb{C}^2$
or $(\mathbb{C}^*)^2$. However, some surprising things happen if we change the
equations slightly.

Exercise 7. Suppose that the equations of (5.1) are $f_1 = f_2 = 0$.

a. Show that generically, the equations $f_1 = x f_2 = 0$ have 18 solutions in
$(\mathbb{C}^*)^2$ and 20 solutions in $\mathbb{C}^2$. Also show that

$MV_2(\mathrm{NP}(f_1), \mathrm{NP}(x f_2)) = 18.$

Hint: Mixed volume is unaffected by translation.

b. Show that generically, the equations $y f_1 = x f_2 = 0$ have 18 solutions
in $(\mathbb{C}^*)^2$ and 21 solutions in $\mathbb{C}^2$. Also show that

$MV_2(\mathrm{NP}(y f_1), \mathrm{NP}(x f_2)) = 18.$

This exercise illustrates that multiplying $f_1$ and $f_2$ by monomials
changes neither the solutions in $(\mathbb{C}^*)^2$ nor the mixed volume,
while the number of solutions in $\mathbb{C}^2$ can change. There are also
examples, not obtained by multiplying by monomials, which have more solutions
in $\mathbb{C}^n$ than in $(\mathbb{C}^*)^n$ (see Exercise 13 below). The
consequence is that the mixed volume is really tied to the solutions in
$(\mathbb{C}^*)^n$. In general, finding the generic number of solutions in
$\mathbb{C}^n$ is a more subtle problem. For some recent progress in this
area, see [HuS2], [LW], [Roj1], [Roj3] and [RW].

We will conclude this section with some remarks on how the BKK bound
can be combined with numerical methods to actually find the solutions of
equations like (5.1). First, recall that for (5.1), Bézout's Theorem gives the
upper bound of 25 for the number of solutions, while the BKK bound of 18 is
smaller (and gives the exact number generically). For the task of computing
numerically all complex solutions of (5.1), the better upper bound 18 is
useful information to have, since once we have found 18 solutions, there
are no others, and whatever method we are using can terminate.

But what sort of numerical method should we use? Earlier, we discussed 
methods based on Gröbner bases and resultants. Now we will say a few
words about numerical homotopy continuation methods, which give another 
approach to practical polynomial equation solving. The method we will 
sketch is especially useful for systems whose coefficients are known only in 
some finite precision approximations, or whose coefficients vary widely in 
size. Our presentation follows [VVC]. 

We begin with a point we did not address in the proof of Theorem (5.4):
exactly how do we extend a solution $(y_1, \ldots, y_n)$ of (5.12) to a
parametric solution $(y_1(t), \ldots, y_n(t))$ of the deformed equations
(5.11)? In general, the problem is to "track" solutions of systems of
equations such as (5.11) where the coefficients depend on a parameter $t$, and
the solutions are thought of as functions of $t$. General methods for doing
this were developed by numerical analysts independently, at about the same
time as the BKK bound. See [AG] and [Dre] for general discussion of these
homotopy continuation methods. The idea is the following. For brevity, we will
write a system of equations

$f_1(x_1, \ldots, x_n) = \cdots = f_n(x_1, \ldots, x_n) = 0$

more compactly as $f(x) = 0$. To solve $f(x) = 0$, we start with a second
system $g(x) = 0$ whose solutions are known in advance. In some versions
of this approach, $g(x)$ might have a simpler form than $f(x)$. In others, as
we will do below, one takes a known system which we expect has the same
number of solutions as $f(x) = 0$.

Then we consider the continuous family of systems

(5.13)  $0 = h(x, t) = c(1 - t)\, g(x) + t\, f(x),$

depending on a parameter $t$, where $c \in \mathbb{C}$ is some constant which
is chosen generically to avoid possible bad special behavior.

When $t = 0$, we get the known system $g(x) = 0$ (up to a constant).
Indeed, $g(x) = 0$ is often called the start system and (5.13) is called a
homotopy or continuation system. As $t$ changes continuously from 0 to 1
along the real line (or more generally along a path in the complex plane),
suppose the rank of the Jacobian matrix of $h(x, t)$ with respect to $x$:

$J(x, t) = \left( \dfrac{\partial h_i}{\partial x_j}(x, t) \right)$

is $n$ for all values of $t$. Then, by the Implicit Function Theorem, if
$x_0$ is a solution of $g(x) = 0$, we obtain a solution curve $x(t)$ with
$x(0) = x_0$ that is parametrized by algebraic functions of $t$. The goal is
to determine the values of $x(t)$ at $t = 1$, since these will yield the
solutions of the system $f(x) = 0$ we are interested in.

To find these parametrized solutions, we proceed as follows. Since we
want $h(x(t), t)$ to be identically zero as a function of $t$, its derivative
$\frac{d}{dt} h(x(t), t)$ should also vanish identically. By the multivariable
chain rule, we see that the solution functions $x(t)$ satisfy

$0 = \dfrac{d}{dt} h(x(t), t) = J(x(t), t)\, \dfrac{dx(t)}{dt} + \dfrac{\partial h}{\partial t}(x(t), t),$

which gives a system of ordinary differential equations (ODEs):

$J(x(t), t)\, \dfrac{dx(t)}{dt} = -\dfrac{\partial h}{\partial t}(x(t), t)$

for the solution functions $x(t)$. Since we also know the initial value
$x(0) = x_0$, one possible approach is to use the well-developed theory of
numerical methods for ODE initial value problems to construct approximate
solutions, continuing this process until approximations to the solution $x(1)$
are obtained.
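As a rough illustration of the ODE approach, the following Python sketch
integrates $dx/dt = -(\partial h/\partial t)/(\partial h/\partial x)$ for a
single equation in one variable, using the classical fourth-order Runge-Kutta
method. Everything specific here is an assumption made for the example: the
target $f(x) = x^2 - 5x + 6$, the start system $g(x) = x^2 - 1$, the constant
$c = 0.6 + 0.8i$, the step count, and the function name `solve_by_ode`.
Practical path trackers use adaptive step sizes and a corrector step, which
this sketch omits.

```python
def solve_by_ode(f, df, g, dg, x0, c=0.6 + 0.8j, steps=1000):
    """Follow one solution path of h(x,t) = c(1-t)g(x) + t f(x) from t=0 to t=1.

    One-variable case: the Jacobian J is the scalar dh/dx, so the ODE reads
    dx/dt = -(dh/dt) / (dh/dx).  Integrated with classical RK4, fixed step.
    """
    def rhs(x, t):
        dh_dx = c * (1 - t) * dg(x) + t * df(x)
        dh_dt = f(x) - c * g(x)
        return -dh_dt / dh_dx

    x, dt = complex(x0), 1.0 / steps
    for k in range(steps):
        t = k * dt
        k1 = rhs(x, t)
        k2 = rhs(x + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = rhs(x + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = rhs(x + dt * k3, t + dt)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

# Assumed example: target system f with roots 2 and 3, start system g with
# the known roots 1 and -1.
f = lambda x: x * x - 5 * x + 6
df = lambda x: 2 * x - 5
g = lambda x: x * x - 1
dg = lambda x: 2 * x

ends = [solve_by_ode(f, df, g, dg, x0) for x0 in (1, -1)]
```

Each start root is carried to an approximate root of $f$; because $c$ is not
real, the two paths stay apart for $0 \le t \le 1$ in this example and end at
different roots.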

Alternatively, we could apply an iterative numerical root-finding method
(such as the Newton-Raphson method) to solve (5.13). The idea is to take a
known solution of (5.13) for $t = 0$ and propagate it in steps of size
$\Delta t$ until $t = 1$. Thus, if we start with a solution $x_0 = x(0)$ for
$t = 0$, we can use it as the initial guess for solving

$h(x(\Delta t), \Delta t) = 0$

using our given numerical method. Then, once we have $x(\Delta t)$, we use it
as the initial guess for solving

$h(x(2\Delta t), 2\Delta t) = 0$

by our chosen method. We continue in this way until we have solved
$h(x(1), 1) = 0$, which will give the desired solution. This method works
because $x(t)$ is a continuous function of $t$, so that at the step with
$t = (k + 1)\Delta t$, we will generally have fairly good estimates for
initial points from the results of the previous step (i.e., for
$t = k\Delta t$), provided $\Delta t$ is sufficiently small.
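The stepping procedure just described can be sketched in a few lines of
Python for one equation in one variable. As before, the polynomials $f$ and
$g$, the constant $c$, the number of steps, and the name `track` are
assumptions made for the example, not part of the method's statement.

```python
def track(f, df, g, dg, x0, c=0.6 + 0.8j, steps=200, newton_iters=6):
    """Propagate a root of g to a root of f along h(x,t) = c(1-t)g(x) + t f(x).

    At each parameter value t = k/steps, the previous solution is used as the
    initial guess for a few Newton-Raphson iterations on h(., t) = 0.
    """
    x = complex(x0)
    for k in range(1, steps + 1):
        t = k / steps
        for _ in range(newton_iters):
            h = c * (1 - t) * g(x) + t * f(x)
            dh = c * (1 - t) * dg(x) + t * df(x)
            x -= h / dh
    return x

# Assumed example: f has roots 2 and 3; the start system g has roots 1, -1.
f = lambda x: x * x - 5 * x + 6
df = lambda x: 2 * x - 5
g = lambda x: x * x - 1
dg = lambda x: 2 * x

roots = [track(f, df, g, dg, x0) for x0 in (1, -1)]
```

With a much larger step $\Delta t$ the initial guess can fall closer to a
different solution path, and two start roots may then end at the same root of
$f$; this is the practical meaning of "provided $\Delta t$ is sufficiently
small."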

When homotopy continuation methods were first developed, the best
commonly-known bound on the number of expected solutions was the
Bézout theorem bound. A common choice for $g(x)$ was a random dense
system with equations of the same total degrees as $f(x)$. But many
polynomial systems (for instance (5.1)) have fewer solutions than general
dense systems of the same total degrees! When this is true, some of the
numerically generated approximate solution paths diverge to infinity as
$t \to 1$. This is because the start equations $g(x) = 0$ would typically have
many more solutions than the sparse system $f(x) = 0$. Much computational
effort can be wasted trying to track them accurately.

As a result, the more refined BKK bound is an important tool in applying
homotopy continuation methods. Instead of a random dense start system
$g(x) = 0$, a much better choice in many cases is a randomly chosen start
system for which the $g_i$ have the same Newton polytopes as the
corresponding $f_i$:

$\mathrm{NP}(g_i) = \mathrm{NP}(f_i).$

Of course, the solutions of $g(x) = 0$ must be determined as well. Unless
solutions of some specific system with precisely these Newton polytopes are
known, some work must be done to solve the start system before the homotopy
continuation method can be applied. For this, the authors of [VVC]
propose adapting the deformations used in the proof of Bernstein's theorem,
and applying a continuation method again to determine the solutions
of $g(x) = 0$. A closely related method, described in [HuS1] and [VGC], uses
the mixed subdivisions to be defined in §6. Also, some interesting numerical
issues are addressed in [HV].

The geometry of polytopes provides powerful tools for understanding 
sparse systems of polynomial equations. The mixed volume is an efficient 
bound for the number of solutions, and homotopy continuation methods 
give practical methods for finding the solutions. This is an active area of 
research, and further progress is likely in the future. 



Additional Exercises for §5 

Exercise 8. If $f \in \mathbb{C}[x]$ is a polynomial of degree $n$, its
discriminant $\mathrm{Disc}(f)$ is defined to be the resultant

$\mathrm{Disc}(f) = \mathrm{Res}_{n,n-1}(f, f'),$

where $f'$ is the derivative of $f$. One can show that
$\mathrm{Disc}(f) \ne 0$ if and only if $f$ has no multiple roots (see
Exercises 7 and 8 of Chapter 3, §5 of [CLO]).

a. Show that the generic polynomial $f \in \mathbb{C}[x]$ has no multiple
roots. Hint: It suffices to show that the discriminant is a nonzero
polynomial in the coefficients of $f$. Prove this by writing down an explicit
polynomial of degree $n$ which has distinct roots.

b. Now let $p_{18} \in \mathbb{C}(a, \ldots, g)[x]$ be the polynomial from
(5.2). To show that $p_{18}$ has no multiple roots generically, we need to
show that $\mathrm{Disc}(p_{18})$ is nonzero as a rational function of
$a, \ldots, g$. Computing this discriminant would be unpleasant since the
coefficients of $p_{18}$ are so complicated. So instead, take $p_{18}$ and
make a random choice of $a, \ldots, g$. This will give a polynomial in
$\mathbb{C}[x]$. Show that the discriminant is nonzero and conclude that
$p_{18}$ has no multiple roots for generic $a, \ldots, g$.

Exercise 9. Let $\nu \in \mathbb{Z}^n$ be a primitive vector (thus
$\nu \ne 0$ and the entries of $\nu$ have no common factor $> 1$). Our goal is
to find an integer $n \times n$ matrix with integer inverse and $\nu$ as its
first row. For the rest of the exercise, we will regard $\nu$ as a column
vector. Hence it suffices to find an integer $n \times n$ matrix with integer
inverse and $\nu$ as its first column.

a. Explain why it suffices to find an integer matrix $A$ with integer inverse
such that $A\nu = e_1$, where $e_1 = (1, 0, \ldots, 0)^T$ is the usual
standard basis vector. Hint: Multiply by $A^{-1}$.

b. An integer row operation consists of a row operation of the following
three types: switching two rows, adding an integer multiple of a row to
another row, and multiplying a row by $\pm 1$. Show that the elementary
matrices corresponding to integer row operations are integer matrices
with integer inverses.

c. Using parts a and b, explain why it suffices to reduce $\nu$ to $e_1$ using
integer row operations.

d. Using integer row operations, show that $\nu$ can be transformed to a
vector $(b_1, \ldots, b_n)^T$ where $b_1 > 0$ and $b_1 \le b_i$ for all $i$
with $b_i \ne 0$.

e. With $(b_1, \ldots, b_n)^T$ as in the previous step, use integer row
operations to subtract multiples of $b_1$ from one of the nonzero entries
$b_i$, $i > 1$, until you get either 0 or something positive and smaller than
$b_1$.

f. By repeatedly applying steps d and e, conclude that we can integer row
reduce $\nu$ to a positive multiple of $e_1$.

g. Finally, show that $\nu$ being primitive implies that the previous step
gives $e_1$ exactly. Hint: Using earlier parts of the exercise, show that we
have $A\nu = d e_1$, where $A$ has an integer inverse. Then use $A^{-1}$ to
conclude that $d$ divides every entry of $\nu$.
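The reduction outlined in this exercise is effectively an algorithm, and a
short Python sketch may help in experimenting with it. The sketch below is one
possible arrangement of the integer row operations, not a prescribed one: it
keeps track of a matrix $A$ with $A\nu = w$ together with $A^{-1}$, reduces
$w$ to $e_1$, and returns $B = (A^{-1})^T$ (whose first row is $\nu$) along
with $B^{-1} = A^T$. The function names are hypothetical.

```python
def unimodular_with_first_row(nu):
    """Return (B, B_inv): integer matrices with B * B_inv = I and B[0] == nu.

    Assumes nu is a primitive integer vector; raises ValueError otherwise.
    """
    n = len(nu)
    if all(x == 0 for x in nu):
        raise ValueError("nu must be nonzero")
    w = list(nu)                                   # invariant: w = A * nu
    A = [[int(i == j) for j in range(n)] for i in range(n)]
    Ainv = [[int(i == j) for j in range(n)] for i in range(n)]

    def add_multiple(i, j, q):                     # row_i += q * row_j
        w[i] += q * w[j]
        A[i] = [a + q * b for a, b in zip(A[i], A[j])]
        for row in Ainv:                           # inverse column operation
            row[j] -= q * row[i]

    def swap(i, j):
        w[i], w[j] = w[j], w[i]
        A[i], A[j] = A[j], A[i]
        for row in Ainv:
            row[i], row[j] = row[j], row[i]

    def negate(i):
        w[i] = -w[i]
        A[i] = [-a for a in A[i]]
        for row in Ainv:
            row[i] = -row[i]

    while sum(1 for x in w if x != 0) > 1:
        # Pivot on the nonzero entry of smallest absolute value and reduce
        # every other entry modulo it (Euclidean algorithm on the entries).
        p = min((k for k in range(n) if w[k] != 0), key=lambda k: abs(w[k]))
        for i in range(n):
            if i != p and w[i] != 0:
                add_multiple(i, p, -(w[i] // w[p]))
    p = next(k for k in range(n) if w[k] != 0)
    if p != 0:
        swap(0, p)
    if w[0] < 0:
        negate(0)
    if w[0] != 1:
        raise ValueError("nu is not primitive")
    B = [[Ainv[i][j] for i in range(n)] for j in range(n)]    # transpose
    B_inv = [[A[i][j] for i in range(n)] for j in range(n)]   # transpose
    return B, B_inv

def matmul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

B2, B2_inv = unimodular_with_first_row((-2, -1))
B3, B3_inv = unimodular_with_first_row((6, 10, 15))
```

The matrix returned is far from unique; any two choices differ by integer row
operations on the rows after the first.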

Exercise 10.

a. Under the coordinate change (5.9), show that the Laurent monomial
$x^\alpha$, $\alpha \in \mathbb{Z}^n$, maps to the Laurent monomial
$y^{-B\alpha}$, where $B\alpha$ is the matrix product.

b. Show that (5.9) actually induces a one-to-one correspondence between
Laurent monomials in $x$ and Laurent monomials in $y$.

c. Show that (5.9) defines a one-to-one, onto mapping from
$(\mathbb{C}^*)^n$ to itself. Also explain how $-B^{-1}$ gives the inverse
mapping.

Exercise 11. Show that

$MV'_{n-1}(B \cdot (P_2)_\nu, \ldots, B \cdot (P_n)_\nu) = MV'_{n-1}((P_2)_\nu, \ldots, (P_n)_\nu),$

where the notation is as in the proof of Bernstein's Theorem.

Exercise 12. Consider the following system of three equations in three
unknowns:

$0 = a_1 x y^2 z + b_1 x^4 + c_1 y + d_1 z + e_1$
$0 = a_2 x y z^2 + b_2 y^3 + c_2$
$0 = a_3 x^3 + b_3 y^2 + c_3 z.$

What is the BKK bound for the generic number of solutions in
$(\mathbb{C}^*)^3$?

Exercise 13. Show that generically, the equations (taken from [RW])

$0 = a x^2 y + b x y^2 + c x + d y$
$0 = e x^2 y + f x y^2 + g x + h y$

have 4 solutions in $(\mathbb{C}^*)^2$ and 5 solutions in $\mathbb{C}^2$.



§6 Computing Resultants and Solving Equations 

The sparse resultant $\mathrm{Res}_{\mathcal{A}}(f_0, \ldots, f_n)$ introduced
in §2 requires that the Laurent polynomials $f_0, \ldots, f_n$ be built from
monomials using the same set $\mathcal{A}$ of exponents. In this section, we
will discuss what happens when we allow each $f_i$ to involve different
monomials. This will lead to the mixed sparse resultant. We also have some
unfinished business from §2, namely the problem of computing sparse
resultants. For this purpose, we will introduce the notion of a mixed
subdivision. These will enable us not only to compute sparse resultants but
also to find mixed volumes and to solve equations using the methods of
Chapter 3.

We begin with a discussion of the mixed sparse resultant. Fix $n + 1$ finite
sets $\mathcal{A}_0, \ldots, \mathcal{A}_n \subset \mathbb{Z}^n$ and consider
$n + 1$ Laurent polynomials $f_i \in L(\mathcal{A}_i)$. The rough idea is that
the resultant

$\mathrm{Res}_{\mathcal{A}_0, \ldots, \mathcal{A}_n}(f_0, \ldots, f_n)$

will measure whether or not the $n + 1$ equations in $n$ variables

(6.1)  $f_0(x_1, \ldots, x_n) = \cdots = f_n(x_1, \ldots, x_n) = 0$

have a solution. To make this precise, we proceed as in §2 and let

$Z(\mathcal{A}_0, \ldots, \mathcal{A}_n) \subset L(\mathcal{A}_0) \times \cdots \times L(\mathcal{A}_n)$

be the Zariski closure of the set of all $(f_0, \ldots, f_n)$ for which (6.1)
has a solution in $(\mathbb{C}^*)^n$.

(6.2) Theorem. Assume that $Q_i = \mathrm{Conv}(\mathcal{A}_i)$ is an
$n$-dimensional polytope for $i = 0, \ldots, n$. Then there is an irreducible
polynomial $\mathrm{Res}_{\mathcal{A}_0, \ldots, \mathcal{A}_n}$ in the
coefficients of the $f_i$ such that

$(f_0, \ldots, f_n) \in Z(\mathcal{A}_0, \ldots, \mathcal{A}_n) \iff \mathrm{Res}_{\mathcal{A}_0, \ldots, \mathcal{A}_n}(f_0, \ldots, f_n) = 0.$

In particular, if (6.1) has a solution
$(t_1, \ldots, t_n) \in (\mathbb{C}^*)^n$, then

$\mathrm{Res}_{\mathcal{A}_0, \ldots, \mathcal{A}_n}(f_0, \ldots, f_n) = 0.$

This theorem is proved in Chapter 8 of [GKZ]. Note that the mixed sparse
resultant includes all of the resultants considered so far. More precisely,
the (unmixed) sparse resultant from §2 is

$\mathrm{Res}_{\mathcal{A}}(f_0, \ldots, f_n) = \mathrm{Res}_{\mathcal{A}, \ldots, \mathcal{A}}(f_0, \ldots, f_n),$

and the multipolynomial resultant studied in Chapter 3 is

$\mathrm{Res}_{d_0, \ldots, d_n}(F_0, \ldots, F_n) = \mathrm{Res}_{\mathcal{A}_0, \ldots, \mathcal{A}_n}(f_0, \ldots, f_n),$

where $\mathcal{A}_i = \{m \in \mathbb{Z}^n_{\ge 0} : |m| \le d_i\}$ and $F_i$
is the homogenization of $f_i$.

We can also determine the degree of the mixed sparse resultant. In §2, we
saw that the degree of $\mathrm{Res}_{\mathcal{A}}$ involves the volume of the
Newton polytope $\mathrm{Conv}(\mathcal{A})$. For the mixed resultant, this
role is played by the mixed volume from §4.

(6.3) Theorem. Assume that $Q_i = \mathrm{Conv}(\mathcal{A}_i)$ is
$n$-dimensional for each $i = 0, \ldots, n$ and that $\mathbb{Z}^n$ is
generated by the differences of elements in
$\mathcal{A}_0 \cup \cdots \cup \mathcal{A}_n$. Then, if we fix $i$ between 0
and $n$, $\mathrm{Res}_{\mathcal{A}_0, \ldots, \mathcal{A}_n}$ is homogeneous
in the coefficients of $f_i$ of degree
$MV_n(Q_0, \ldots, Q_{i-1}, Q_{i+1}, \ldots, Q_n)$. Thus

$\mathrm{Res}_{\mathcal{A}_0, \ldots, \mathcal{A}_n}(f_0, \ldots, \lambda f_i, \ldots, f_n) = \lambda^{MV_n(Q_0, \ldots, Q_{i-1}, Q_{i+1}, \ldots, Q_n)}\, \mathrm{Res}_{\mathcal{A}_0, \ldots, \mathcal{A}_n}(f_0, \ldots, f_n).$

A proof can be found in Chapter 8 of [GKZ]. Observe that this result
generalizes both Theorem (3.1) of Chapter 3 and Theorem (2.9) of this
chapter. There are also more general versions of Theorems (6.2) and (6.3)
which don't require that the $Q_i$ be $n$-dimensional. See, for instance,
[Stu3]. Exercise 9 at the end of the section gives a simple example of a
sparse resultant where all of the $Q_i$ have dimension $< n$.

We next discuss how to compute sparse resultants. Looking back at
Chapter 3, recall that there were wonderful formulas for the multipolynomial
case, but in general, computing these resultants was not easy. The known
formulas for multipolynomial resultants fall into three main classes:

• Special cases where the resultant is given as a determinant. This includes
the resultants $\mathrm{Res}_{l,m}$ and $\mathrm{Res}_{2,2,2}$ from §1 and §2
of Chapter 3.

• The general case where the resultant is given as the GCD of $n + 1$
determinants. This is Proposition (4.7) of Chapter 3.

• The general case where the resultant is given as the quotient of two
determinants. This is Theorem (4.9) of Chapter 3.

Do sparse resultants behave similarly? In §2 of this chapter, we gave
formulas for the Dixon resultant (see (2.12) and Exercise 10 of §2). Other
determinantal formulas for sparse resultants can be found in [SZ] and [WZ],
so that the first bullet definitely has sparse analogs. We will see below that
the second bullet also has a sparse analog. However, as of this writing, it is
not known if a sparse analog of the third bullet exists: there is no known
systematic way of writing $\mathrm{Res}_{\mathcal{A}}$, or more generally
$\mathrm{Res}_{\mathcal{A}_0, \ldots, \mathcal{A}_n}(f_0, \ldots, f_n)$,
as a quotient of two determinants.

We now introduce our main tool for computing sparse resultants. The
idea is to subdivide the Minkowski sum $Q = Q_0 + \cdots + Q_n$ in a special
way. We begin with what it means to subdivide a polytope.

(6.4) Definition. Let $Q \subset \mathbb{R}^n$ be a polytope of dimension
$n$. Then a polyhedral subdivision of $Q$ consists of finitely many
$n$-dimensional polytopes $R_1, \ldots, R_s$ (the cells of the subdivision)
such that $Q = R_1 \cup \cdots \cup R_s$ and for $i \ne j$, the intersection
$R_i \cap R_j$ is a face of both $R_i$ and $R_j$.

For example, Fig. 7.7 below shows three ways of dividing a square into
smaller pieces. The first two are polyhedral subdivisions, but the third isn't
since $R_1 \cap R_2$ is not a face of $R_1$ (and $R_1 \cap R_3$ has a similar
problem).

We next define what it means for a polyhedral subdivision to be compatible
with a Minkowski sum. Suppose that $Q_1, \ldots, Q_m$ are arbitrary
polytopes in $\mathbb{R}^n$.

(6.5) Definition. Let $Q = Q_1 + \cdots + Q_m \subset \mathbb{R}^n$ be a
Minkowski sum of polytopes, and assume that $Q$ has dimension $n$. Then a
subdivision $R_1, \ldots, R_s$ of $Q$ is a mixed subdivision if each cell
$R_i$ can be written as a Minkowski sum

$R_i = F_1 + \cdots + F_m$

where each $F_j$ is a face of $Q_j$ and
$n = \dim(F_1) + \cdots + \dim(F_m)$.

Exercise 1. Consider the polytopes

$P_1 = \mathrm{Conv}((0,0), (1,0), (3,2), (0,2))$
$P_2 = \mathrm{Conv}((0,1), (3,0), (1,4)).$

The Minkowski sum $P = P_1 + P_2$ was illustrated in Fig. 7.6 of §4.

a. Prove that Fig. 7.8 on the next page gives a mixed subdivision of $P$.

b. Find a different mixed subdivision of $P$.

When we have a mixed subdivision, some of the cells making up the 
subdivision are especially important. 




Figure 7.7. Subdividing the Square

Figure 7.8. Mixed Subdivision of a Minkowski Sum



(6.6) Definition. Suppose that $R = F_1 + \cdots + F_m$ is a cell in a mixed
subdivision of $Q = Q_1 + \cdots + Q_m$. Then $R$ is called a mixed cell if
$\dim(F_i) \le 1$ for all $i$.

Exercise 2. Show that the mixed subdivision illustrated in Fig. 7.8 has
three mixed cells.

As an application of mixed subdivisions, we will give a surprisingly easy
formula for mixed volume. Given $n$ polytopes
$Q_1, \ldots, Q_n \subset \mathbb{R}^n$, we want to compute the mixed volume
$MV_n(Q_1, \ldots, Q_n)$. We begin with a mixed subdivision of
$Q = Q_1 + \cdots + Q_n$. In this situation, observe that every mixed cell $R$
is a sum of edges (because the faces $F_i \subset Q_i$ summing to $R$ satisfy
$n = \dim(F_1) + \cdots + \dim(F_n)$ and $\dim(F_i) \le 1$). Then the mixed
cells determine the mixed volume in the following simple manner.

(6.7) Theorem. Given polytopes $Q_1, \ldots, Q_n \subset \mathbb{R}^n$ and a
mixed subdivision of $Q = Q_1 + \cdots + Q_n$, the mixed volume
$MV_n(Q_1, \ldots, Q_n)$ is computed by the formula

$MV_n(Q_1, \ldots, Q_n) = \sum_{R} \mathrm{Vol}_n(R),$

where the sum is over all mixed cells $R$ of the mixed subdivision.

Proof. This result was known in the polytope community for some time,
though it was first written up in [Bet]. An independently discovered proof
can be found in [HuS1], which includes not only the above formula but also
formulas for computing the mixed volumes $MV_n(P; \alpha)$ from Exercise 18 of
§4 in terms of certain non-mixed cells in the mixed subdivision. □

One feature which makes Theorem (6.7) useful is that the volume of a
mixed cell $R$ is easy to compute. Namely, if we write
$R = F_1 + \cdots + F_n$ as a sum of edges $F_i$ and let $v_i$ be the vector
connecting the two vertices of $F_i$, then one can show that the volume of the
cell is

$\mathrm{Vol}_n(R) = |\det(A)|,$

where $A$ is the $n \times n$ matrix whose columns are the edge vectors
$v_1, \ldots, v_n$.
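The determinant formula is simple enough to check by machine. The Python
sketch below computes $|\det(A)|$ from a list of integer edge vectors using
cofactor expansion, which is adequate for the small matrices arising in
examples. The function names and the sample edge vectors are assumptions for
illustration; whether a given pair of edges actually forms a mixed cell
depends on the subdivision.

```python
def det(M):
    """Determinant of a square integer matrix (list of rows) by cofactor
    expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def mixed_cell_volume(edge_vectors):
    """|det(A)|, where the columns of A are the given edge vectors."""
    n = len(edge_vectors)
    A = [[edge_vectors[j][i] for j in range(n)] for i in range(n)]
    return abs(det(A))

# Assumed example in the plane: the edge from (1,0) to (3,2) of P_1 and the
# edge from (3,0) to (1,4) of P_2 in Exercise 1 have edge vectors (2,2) and
# (-2,4), so a cell that is the sum of these two edges has area 12.
area = mixed_cell_volume([(2, 2), (-2, 4)])

# Assumed example in three dimensions.
volume = mixed_cell_volume([(1, 0, 0), (0, 1, 0), (1, 1, 2)])
```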

Exercise 3. Use Theorem (6.7) and the above observation to compute the
mixed volume $MV_2(P_1, P_2)$, where $P_1$ and $P_2$ are as in Exercise 1.

Theorem (6.7) has some nice consequences. First, it shows that the mixed
volume is nonnegative, which is not obvious from the definition given in §4.
Second, since all mixed cells lie inside the Minkowski sum, we can relate
mixed volume to the volume of the Minkowski sum as follows:

$MV_n(Q_1, \ldots, Q_n) \le \mathrm{Vol}_n(Q_1 + \cdots + Q_n).$

By [Emi1], we have a lower bound for mixed volume as well:

$MV_n(Q_1, \ldots, Q_n) \ge n!\, \sqrt[n]{\mathrm{Vol}_n(Q_1) \cdots \mathrm{Vol}_n(Q_n)}.$

Mixed volume also satisfies the Alexandrov-Fenchel inequality, which is
discussed in [Ewa] and [Ful].
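For polygons these quantities are easy to compute directly. The Python sketch
below assumes $n = 2$ and uses the identity
$MV_2(P, Q) = \mathrm{Vol}_2(P + Q) - \mathrm{Vol}_2(P) - \mathrm{Vol}_2(Q)$,
which holds for the normalization $MV_n(P, \ldots, P) = n!\,\mathrm{Vol}_n(P)$;
it then checks both bounds numerically for the polygons of Exercise 1. The
helper names are hypothetical, and the convex hull routine is the standard
monotone chain method rather than anything specific to this text.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

def convex_hull(points):
    """Vertices of the convex hull in counterclockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def area(points):
    """Area of the convex hull of the given points (shoelace formula)."""
    hull = convex_hull(points)
    twice = sum(hull[k][0] * hull[(k + 1) % len(hull)][1]
                - hull[(k + 1) % len(hull)][0] * hull[k][1]
                for k in range(len(hull)))
    return Fraction(abs(twice), 2)

def minkowski_sum(P, Q):
    """Vertices of P + Q, as the hull of all pairwise sums of vertices."""
    return convex_hull([(p[0] + q[0], p[1] + q[1]) for p, q in product(P, Q)])

def mixed_volume_2d(P, Q):
    return area(minkowski_sum(P, Q)) - area(P) - area(Q)

P1 = [(0, 0), (1, 0), (3, 2), (0, 2)]
P2 = [(0, 1), (3, 0), (1, 4)]
mv = mixed_volume_2d(P1, P2)
upper_bound = area(minkowski_sum(P1, P2))
lower_bound = 2 * sqrt(area(P1) * area(P2))   # n! times the n-th root, n = 2
```

Here the areas of `P1`, `P2` and their Minkowski sum are 4, 5 and 27, so the
mixed volume 18 lies between roughly 8.94 and 27.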

Exercise 4. Work out the inequalities displayed above for the polytopes
$P_1$ and $P_2$ from Exercise 1.

All of this is very nice, except for one small detail: how do we find mixed
subdivisions? Fortunately, they are fairly easy to compute in practice. We
will describe briefly how this is done. The first step is to "lift" the
polytopes $Q_1, \ldots, Q_n \subset \mathbb{R}^n$ to $\mathbb{R}^{n+1}$ by
picking random vectors $l_1, \ldots, l_n \in \mathbb{Z}^n$ and considering the
polytopes

$\hat{Q}_i = \{(v, l_i \cdot v) : v \in Q_i\} \subset \mathbb{R}^n \times \mathbb{R} = \mathbb{R}^{n+1}.$

If we regard $l_i$ as the linear map $\mathbb{R}^n \to \mathbb{R}$ defined by
$v \mapsto l_i \cdot v$, then $\hat{Q}_i$ is the portion of the graph of $l_i$
lying over $Q_i$.

Now consider the polytope
$\hat{Q} = \hat{Q}_1 + \cdots + \hat{Q}_n \subset \mathbb{R}^{n+1}$. We say
that a facet $\hat{F}$ of $\hat{Q}$ is a lower facet if its outward-pointing
normal has a negative $t_{n+1}$-coordinate, where $t_{n+1}$ is the last
coordinate of $\mathbb{R}^{n+1} = \mathbb{R}^n \times \mathbb{R}$. If the
$l_i$ are sufficiently generic, one can show that the projection
$\mathbb{R}^{n+1} \to \mathbb{R}^n$ onto the first $n$ coordinates carries the
lower facets $\hat{F} \subset \hat{Q}$ onto $n$-dimensional polytopes
$R \subset Q = Q_1 + \cdots + Q_n$, and these polytopes form the cells of a
mixed subdivision of $Q$. The theoretical background for this construction is
given in [BS] and some nice pictures appear in [CE2] (see also [HuS1], [CE1]
and [EC]). Mixed subdivisions arising in this way are said to be coherent.

Exercise 5. Let $Q_1 = \mathrm{Conv}((0, 0), (1, 0), (0, 1))$ be the unit simplex in the
plane, and consider the vectors $l_1 = (0, 4)$ and $l_2 = (2, 1)$. This exercise
will apply the above strategy to create a coherent mixed subdivision of
$Q = Q_1 + Q_2$, where $Q_2 = Q_1$.

a. Write $\hat{Q}_1$ and $\hat{Q}_2$ as convex hulls of sets of three points, and then express
$\hat{Q} = \hat{Q}_1 + \hat{Q}_2$ as the convex hull of 9 points in $\mathbb{R}^3$.

b. In $\mathbb{R}^3$, plot the points of $\hat{Q}$ found in part a. Note that each such point
lies over a point of $Q$.

c. Find the lower facets of $\hat{Q}$ (there are 3 of them) and use this to determine
the corresponding coherent mixed subdivision of $Q$. Hint: When one
point lies above another, the higher point can’t lie in a lower facet.

d. Show that choosing $l_1 = (1, 1)$ and $l_2 = (2, 3)$ leads to a different
coherent mixed subdivision of $Q$.
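
The lifting construction is small enough here to be carried out by brute force. The following sketch assumes that $Q_2 = Q_1$ is the unit simplex (consistent with the three-point hulls and three lower facets mentioned in the exercise) and simply tests every plane through three lifted points; it is meant as a way to check hand computations, not as a substitute for a convex hull program such as qhull. The function names are our own.

```python
from itertools import combinations, product


def lift(points, l):
    """Lift planar points v to (v, l . v) in R^3."""
    return [(x, y, l[0] * x + l[1] * y) for (x, y) in points]


def lower_facets(points3d):
    """Projections of the lower facets of Conv(points3d), found by brute force.

    For each non-vertical plane through three of the points, orient the normal
    so that its last coordinate is negative; the plane supports a lower facet
    exactly when no point lies strictly on the normal's side.
    """
    pts = sorted(set(points3d))
    facets = set()
    for a, b, c in combinations(pts, 3):
        u = tuple(b[i] - a[i] for i in range(3))
        v = tuple(c[i] - a[i] for i in range(3))
        n = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
        if n[2] == 0:
            continue  # collinear triple or vertical plane
        if n[2] > 0:
            n = tuple(-t for t in n)
        offsets = [sum(n[i] * (p[i] - a[i]) for i in range(3)) for p in pts]
        if all(o <= 0 for o in offsets):
            # Record the projected points lying on this supporting plane.
            facets.add(frozenset((p[0], p[1])
                                 for p, o in zip(pts, offsets) if o == 0))
    return facets


def coherent_cells(Q1, Q2, l1, l2):
    """Cells of the subdivision of Q1 + Q2 induced by the lifts l1, l2."""
    hat_Q = [tuple(s + t for s, t in zip(p, q))
             for p, q in product(lift(Q1, l1), lift(Q2, l2))]
    return lower_facets(hat_Q)


simplex = [(0, 0), (1, 0), (0, 1)]  # Q_1, and Q_2 under the stated assumption
cells_c = coherent_cells(simplex, simplex, (0, 4), (2, 1))
cells_d = coherent_cells(simplex, simplex, (1, 1), (2, 3))
for cell in sorted(map(sorted, cells_c)):
    print(cell)
```

Each cell is reported as the set of projected lattice points lying on the corresponding lower facet; a cell with four points is a parallelogram (a mixed cell), and the others are translates of the simplex.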

Algorithms for computing mixed subdivisions and mixed volumes can be 
found in [EC]. It is known that computing mixed volume is #P-complete 
(see [Ped]). Being #P-complete is similar to being NP-complete — the dif- 
ference is that NP-complete refers to a class of hard decision problems, 
while #P-complete refers to certain hard enumerative problems. The pa- 
per [EC] also explains how to obtain publicly-available software for doing 
these computations. 

We now return to our original question of computing the mixed sparse
resultant $\mathrm{Res}_{\mathcal{A}_0, \dots, \mathcal{A}_n}(f_0, \dots, f_n)$. In this situation, we have $n + 1$ polytopes
$Q_i = \mathrm{Conv}(\mathcal{A}_i)$. Our goal is to show that a coherent mixed subdivision of
the Minkowski sum $Q = Q_0 + \cdots + Q_n$ gives a systematic way to compute
the sparse resultant.

To see how this works, first recall what we did in Chapter 3. If we think
of the multipolynomial resultant $\mathrm{Res}_{d_0, \dots, d_n}(F_0, \dots, F_n)$ in homogeneous
terms, then the method presented in §4 of Chapter 3 goes as follows: we
fixed the set of monomials of total degree $d_0 + \cdots + d_n - n$ and wrote this
set as a disjoint union $S_0 \cup \cdots \cup S_n$. Then, for each monomial $x^\alpha \in S_i$, we
multiplied $F_i$ by $x^\alpha / x_i^{d_i}$. This led to the equations (4.1) of Chapter 3:

$$(x^\alpha / x_i^{d_i}) F_i = 0, \quad x^\alpha \in S_i, \quad i = 0, \dots, n.$$

Expressing these polynomials in terms of the monomials in our set gave a
system of equations, and the determinant of the coefficient matrix was the
polynomial $D_n$ in Definition (4.2) of Chapter 3.

By varying this construction slightly, we got determinants $D_0, \dots, D_n$
with the following two properties:

• Each $D_i$ is a nonzero multiple of the resultant.

• For $i$ fixed, the degree of $D_i$ as a polynomial in the coefficients of $F_i$ is
the same as the degree of the resultant in these coefficients.

(See §4 of Chapter 3, especially Exercise 7 and Proposition (4.6)). From
here, we easily proved

$$\mathrm{Res}_{d_0, \dots, d_n} = \pm\mathrm{GCD}(D_0, \dots, D_n),$$

which is Proposition (4.7) of Chapter 3.

We will show that this entire framework goes through with little change
in the sparse case. Suppose we have exponent sets $\mathcal{A}_0, \dots, \mathcal{A}_n$, and as above set
$Q_i = \mathrm{Conv}(\mathcal{A}_i)$. Also assume that we have a coherent mixed subdivision
of $Q = Q_0 + \cdots + Q_n$. The first step in computing the sparse resultant is
to fix a set of monomials or, equivalently, a set of exponents. We will call
this set $\mathcal{E}$, and we define $\mathcal{E}$ to be

$$\mathcal{E} = \mathbb{Z}^n \cap (Q + \delta),$$

where $\delta \in \mathbb{R}^n$ is a small vector chosen so that for every $\alpha \in \mathcal{E}$, there is a
cell $R$ of the mixed subdivision such that $\alpha$ lies in the interior of $R + \delta$.
Intuitively, we displace the subdivision slightly so that the lattice points lie
in the interiors of the cells.

The following exercise illustrates what this looks like in a particularly 
simple case. We will refer to this exercise several times as we explain how 
to compute $\mathrm{Res}_{\mathcal{A}_0, \dots, \mathcal{A}_n}$.

Exercise 6. Consider the equations

$$\begin{aligned} 0 &= f_0 = a_1 x + a_2 y + a_3 \\ 0 &= f_1 = b_1 x + b_2 y + b_3 \\ 0 &= f_2 = c_1 x^2 + c_2 y^2 + c_3 + c_4 xy + c_5 x + c_6 y \end{aligned}$$

obtained by setting $z = 1$ in equations (2.9) from Chapter 3. If $\mathcal{A}_i$ is the
set of exponents appearing in $f_i$, then $\mathrm{Res}_{\mathcal{A}_0, \mathcal{A}_1, \mathcal{A}_2}$ is the resultant $\mathrm{Res}_{1,1,2}$
considered in Proposition (2.10) of Chapter 3.

a. If we let $l_0 = (0, 4)$, $l_1 = (2, 1)$ and $l_2 = (5, 7)$, then show that we
get the coherent mixed subdivision of $Q$ pictured in Fig. 7.9.
This calculation is not easy to do by hand; you should use
a program such as qhull (available from the Geometry Center at the
University of Minnesota) to compute convex hulls.

b. If $\delta = (\epsilon, \epsilon)$ for small $\epsilon > 0$, show that $\mathcal{E}$ contains the six exponent
vectors indicated by dots in Fig. 7.9. We will think of $\mathcal{E}$ as consisting of
the monomials

$$x^3 y, \; x^2 y^2, \; x^2 y, \; x y^3, \; x y^2, \; x y.$$

The reason for listing the monomials this way will soon become clear.

c. If $\delta = (-\epsilon, -\epsilon)$ for small $\epsilon > 0$, show that $\mathcal{E}$ consists of 10 exponent
vectors. So different $\delta$’s can give very different $\mathcal{E}$’s!

Figure 7.9. A Coherent Mixed Subdivision and its Shift
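
Parts b and c of Exercise 6 only require counting lattice points in a shifted triangle, which is easy to confirm by machine. The sketch below assumes all coefficients of $f_0, f_1, f_2$ are nonzero, so that $Q_0$ and $Q_1$ are the unit simplex, $Q_2$ is twice the unit simplex, and $Q = Q_0 + Q_1 + Q_2$ is the triangle $x \ge 0$, $y \ge 0$, $x + y \le 4$. The value $\epsilon = 1/100$ and the function name are our own choices for illustration.

```python
from fractions import Fraction


def shifted_lattice_points(delta, size=4):
    """Integer points alpha with alpha - delta in the triangle
    x >= 0, y >= 0, x + y <= size, i.e. alpha in Q + delta."""
    points = []
    for a in range(-1, size + 2):
        for b in range(-1, size + 2):
            x, y = a - delta[0], b - delta[1]
            if x >= 0 and y >= 0 and x + y <= size:
                points.append((a, b))
    return points


eps = Fraction(1, 100)  # a stand-in for "small epsilon > 0"
E_plus = shifted_lattice_points((eps, eps))     # delta = (eps, eps)
E_minus = shifted_lattice_points((-eps, -eps))  # delta = (-eps, -eps)

print(len(E_plus), sorted(E_plus))
print(len(E_minus))
```

The exponent vectors in `E_plus` are those of the six monomials $x^3y, x^2y^2, x^2y, xy^3, xy^2, xy$, while the opposite shift picks up every lattice point with $x + y \le 3$.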

Now that we have $\mathcal{E}$, our next task is to break it up into a disjoint union
$S_0 \cup \cdots \cup S_n$. This is where the coherent mixed subdivision comes in. Each
cell $R$ of the subdivision is a Minkowski sum

$$R = F_0 + \cdots + F_n,$$

where the $F_i \subset Q_i$ are faces such that $n = \dim(F_0) + \cdots + \dim(F_n)$. Note
that at least one $F_i$ must have $\dim(F_i) = 0$, i.e., at least one $F_i$ is a vertex.
Sometimes $R$ can be written in the above form in several ways (we will
see an example below), but using the coherence of our mixed subdivision,
we get a canonical way of doing this. Namely, $R$ is the projection of a
lower facet $\hat{F} \subset \hat{Q}$, and one can show that $\hat{F}$ can be uniquely written as a
Minkowski sum

$$\hat{F} = \hat{F}_0 + \cdots + \hat{F}_n,$$

where $\hat{F}_i$ is a face of $\hat{Q}_i$. If $F_i \subset Q_i$ is the projection of $\hat{F}_i$, then the
induced Minkowski sum $R = F_0 + \cdots + F_n$ is called coherent. Now, for
each $i$ between 0 and $n$, we define the subset $S_i \subset \mathcal{E}$ as follows:

$$S_i = \{\alpha \in \mathcal{E} : \text{if } \alpha \in R + \delta \text{ and } R = F_0 + \cdots + F_n \text{ is coherent, then } i \text{ is the smallest index such that } F_i \text{ is a vertex}\}. \tag{6.8}$$

This gives a disjoint union $\mathcal{E} = S_0 \cup \cdots \cup S_n$. Furthermore, if $\alpha \in S_i$, we let
$v(\alpha)$ denote the vertex $F_i$ in (6.8), i.e., $F_i = \{v(\alpha)\}$. Since $Q_i = \mathrm{Conv}(\mathcal{A}_i)$,
it follows that $v(\alpha) \in \mathcal{A}_i$.

Exercise 7. For the coherent subdivision of Exercise 6, show that

$$S_0 = \{x^3 y, x^2 y^2, x^2 y\}, \quad S_1 = \{x y^3, x y^2\}, \quad S_2 = \{x y\},$$

and that

$$x^{v(\alpha)} = \begin{cases} x & \text{for } x^\alpha \in S_0 \\ y & \text{for } x^\alpha \in S_1 \\ 1 & \text{for } x^\alpha \in S_2. \end{cases}$$

(Here, we regard $\mathcal{E}$ and the $S_i$ as consisting of monomials rather than
exponent vectors.) Hint: The exponent vector $\alpha = (1, 3)$ of $x y^3$ lies in
$R_2 + \delta$, where we are using the labelling of Fig. 7.9. If $\hat{F}$ is the lower facet
lying over $R_2$, one computes (using a program such as qhull) that

$$\hat{F} = \text{edge of } \hat{Q}_0 + (0, 1, 1) + \text{edge of } \hat{Q}_2,$$

which implies that $R_2 = \text{edge of } Q_0 + (0, 1) + \text{edge of } Q_2$ is coherent. Thus
$x y^3 \in S_1$ and $x^{v(\alpha)} = y$, and the other monomials are handled similarly.

The following lemma will allow us to create the determinants used in 
computing the sparse resultant. 

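(6.9) Lemma. If $\alpha \in S_i$, then $(x^\alpha / x^{v(\alpha)}) f_i \in L(\mathcal{E})$.

Proof. If $\alpha \in R + \delta = F_0 + \cdots + F_n + \delta$, then $\alpha = \beta_0 + \cdots + \beta_n + \delta$,
where $\beta_j \in F_j \subset Q_j$ for $0 \le j \le n$. Since $\alpha \in S_i$, we know that $F_i$ is the
vertex $v(\alpha)$, which implies $\beta_i = v(\alpha)$. Thus

$$\alpha = \beta_0 + \cdots + \beta_{i-1} + v(\alpha) + \beta_{i+1} + \cdots + \beta_n + \delta.$$

It follows that if $\beta \in \mathcal{A}_i$, then the exponent vector of $(x^\alpha / x^{v(\alpha)}) x^\beta$ is

$$\alpha - v(\alpha) + \beta = \beta_0 + \cdots + \beta_{i-1} + \beta + \beta_{i+1} + \cdots + \beta_n + \delta \in Q + \delta.$$

This vector is integral and hence lies in $\mathcal{E} = \mathbb{Z}^n \cap (Q + \delta)$. Since $f_i$ is a
linear combination of the $x^\beta$ for $\beta \in \mathcal{A}_i$, the lemma follows. □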

Now consider the equations

$$(x^\alpha / x^{v(\alpha)}) f_i = 0, \quad \alpha \in S_i. \tag{6.10}$$

We get one equation for each $\alpha$, which means that we have $|\mathcal{E}|$ equations,
where $|\mathcal{E}|$ denotes the number of elements in $\mathcal{E}$. By Lemma (6.9), each
$(x^\alpha / x^{v(\alpha)}) f_i$ can be written as a linear combination of the monomials $x^\beta$
for $\beta \in \mathcal{E}$. If we regard these monomials as “unknowns”, then (6.10) is a
system of $|\mathcal{E}|$ equations in $|\mathcal{E}|$ unknowns.

(6.11) Definition. $D_n$ is the determinant of the coefficient matrix of the
$|\mathcal{E}| \times |\mathcal{E}|$ system of linear equations given by (6.10).

Notice the similarity with Definition (4.2) of Chapter 3. Here is a specific
example of what this determinant looks like.

Exercise 8. Consider the polynomials $f_0, f_1, f_2$ from Exercise 6 and the
decomposition $\mathcal{E} = S_0 \cup S_1 \cup S_2$ from Exercise 7.

a. Show that the equations (6.10) are precisely the equations obtained
from (2.11) in Chapter 3 by setting $z = 1$ and multiplying each equation
by $xy$. This explains why we wrote the elements of $\mathcal{E}$ in the order
$x^3 y, x^2 y^2, x^2 y, x y^3, x y^2, x y$.

b. Use Proposition (2.10) of Chapter 3 to conclude that the determinant
$D_2$ satisfies

$$D_2 = \pm a_1 \mathrm{Res}_{1,1,2}(f_0, f_1, f_2).$$
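
For numerical coefficients, the determinant $D_2$ of Definition (6.11) can be assembled directly from the decomposition in Exercise 7. The sketch below is a minimal illustration under our own choice of sample coefficients: the first system is built to have the common root $(1, 1)$, so its determinant must vanish, and the second perturbs the constant term of $f_2$ so that the root is lost. The data layout (dictionaries from exponent vectors to coefficients) and all names are assumptions of this example.

```python
from fractions import Fraction

# Exponent vectors of x^3y, x^2y^2, x^2y, xy^3, xy^2, xy (column order).
E = [(3, 1), (2, 2), (2, 1), (1, 3), (1, 2), (1, 1)]

# One row per alpha in E: (alpha, index i with alpha in S_i, vertex v(alpha)).
ROWS = [((3, 1), 0, (1, 0)), ((2, 2), 0, (1, 0)), ((2, 1), 0, (1, 0)),
        ((1, 3), 1, (0, 1)), ((1, 2), 1, (0, 1)),
        ((1, 1), 2, (0, 0))]


def coefficient_matrix(polys):
    """Coefficient matrix of the equations (x^alpha / x^v(alpha)) f_i = 0."""
    matrix = []
    for alpha, i, v in ROWS:
        shift = (alpha[0] - v[0], alpha[1] - v[1])
        row = [Fraction(0)] * len(E)
        for (bx, by), coeff in polys[i].items():
            # Lemma (6.9) guarantees the shifted exponent lies in E.
            row[E.index((bx + shift[0], by + shift[1]))] += coeff
        matrix.append(row)
    return matrix


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


# Sample coefficients: all three polynomials vanish at (x, y) = (1, 1).
f0 = {(1, 0): 1, (0, 1): -2, (0, 0): 1}             # x - 2y + 1
f1 = {(1, 0): 2, (0, 1): 1, (0, 0): -3}             # 2x + y - 3
f2 = {(2, 0): 1, (0, 2): 2, (0, 0): 3,
      (1, 1): 1, (1, 0): -4, (0, 1): -3}            # x^2 + 2y^2 + 3 + xy - 4x - 3y
f2_perturbed = dict(f2)
f2_perturbed[(0, 0)] = 4                            # no longer zero at (1, 1)

D2_common_root = determinant(coefficient_matrix([f0, f1, f2]))
D2_perturbed = determinant(coefficient_matrix([f0, f1, f2_perturbed]))
print(D2_common_root, D2_perturbed)
```

When the system has a common root, the vector of monomial values at that root lies in the kernel of the matrix, which is why the first determinant is zero.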

This exercise suggests a close relation between $D_n$ and $\mathrm{Res}_{\mathcal{A}_0, \dots, \mathcal{A}_n}$. In
general, we have the following result.

(6.12) Theorem. The determinant $D_n$ is a nonzero multiple of the mixed
sparse resultant $\mathrm{Res}_{\mathcal{A}_0, \dots, \mathcal{A}_n}$. Furthermore, the degree of $D_n$ as a polynomial
in the coefficients of $f_n$ is the mixed volume $MV_n(Q_0, \dots, Q_{n-1})$.

Proof. If the equations $f_0 = \cdots = f_n = 0$ have a solution in $(\mathbb{C}^*)^n$,
then the equations (6.10) have a nontrivial solution, and hence the coefficient
matrix has zero determinant. It follows that $D_n$ vanishes on the set
$Z(\mathcal{A}_0, \dots, \mathcal{A}_n)$ from Theorem (6.2). Since the resultant is the irreducible
defining equation of this set, it must divide $D_n$. (This argument is similar
to one used frequently in Chapter 3.)

To show that $D_n$ is nonzero, we must find $f_0, \dots, f_n$ for which $D_n \ne 0$.
For this purpose, introduce a new variable $t$ and let

$$f_i = \sum_{\alpha \in \mathcal{A}_i} t^{l_i \cdot \alpha} x^\alpha, \tag{6.13}$$

where the $l_i \in \mathbb{Z}^n$ are the vectors used in the construction of the coherent
mixed subdivision of $Q = Q_0 + \cdots + Q_n$. Section 4 of [CE1] shows that
$D_n \ne 0$ for this choice of the $f_i$. We should also mention that without
coherence, it can happen that $D_n$ is identically zero. See Exercise 10 at the
end of the section for an example.

Finally, we compute the degree of $D_n$ as a polynomial in the coefficients
of $f_n$. In (6.10), the coefficients of $f_n$ appear in the equations coming from
$S_n$, so that $D_n$ has degree $|S_n|$ in these coefficients. So we need only prove

$$|S_n| = MV_n(Q_0, \dots, Q_{n-1}). \tag{6.14}$$

If $\alpha \in S_n$, the word smallest in (6.8) means that $\alpha \in R + \delta$, where $R =
F_0 + \cdots + F_n$ and $\dim(F_i) > 0$ for $i = 0, \dots, n - 1$. Since the dimensions
of the $F_i$ sum to $n$, we must have $\dim(F_0) = \cdots = \dim(F_{n-1}) = 1$. Thus
$R$ is a mixed cell with $F_n$ as the unique vertex in the sum. Conversely, any
mixed cell of the subdivision must have exactly one $F_i$ which is a vertex
(since the $\dim(F_i) \le 1$ add up to $n$). Thus, if $R$ is a mixed cell where $F_n$ is
a vertex, then $\mathbb{Z}^n \cap (R + \delta) \subset S_n$ follows from (6.8). This gives the formula

$$|S_n| = \sum_{F_n \text{ is a vertex}} |\mathbb{Z}^n \cap (R + \delta)|,$$

where the sum is over all mixed cells $R = F_0 + \cdots + F_n$ of the subdivision
of $Q$ for which $F_n$ is a vertex.

We now use two nice facts. First, the mixed cells $R$ where $F_n$ is a vertex
are translates of the mixed cells in a mixed subdivision of $Q_0 + \cdots + Q_{n-1}$.
Furthermore, Lemma 5.3 of [Emi1] implies that all mixed cells in this
subdivision of $Q_0 + \cdots + Q_{n-1}$ appear in this way. Since translation doesn’t
affect volume, Theorem (6.7) then implies

$$MV_n(Q_0, \dots, Q_{n-1}) = \sum_{F_n \text{ is a vertex}} \mathrm{Vol}_n(R),$$

where we sum over the same mixed cells as before. The second nice fact is
that each of these cells $R$ is a Minkowski sum of edges (up to translation
by the vertex $F_n$), so that by Section 5 of [CE1], the volume of $R$ is the
number of lattice points in a generic small translation. This means

$$\mathrm{Vol}_n(R) = |\mathbb{Z}^n \cap (R + \delta)|,$$

and (6.14) now follows immediately. □

This shows that $D_n$ has the desired properties. Furthermore, we get
other determinants $D_0, \dots, D_{n-1}$ by changing how we choose the subsets
$S_i \subset \mathcal{E}$. For instance, if we replace smallest by largest in (6.8), then we get a
determinant $D_0$ whose degree in the coefficients of $f_0$ is $MV_n(Q_1, \dots, Q_n)$.
More generally, for each $j$ between 0 and $n$, we can find a determinant
$D_j$ which is a nonzero multiple of the resultant and whose degree in the
coefficients of $f_j$ is the mixed volume

$$MV_n(Q_0, \dots, Q_{j-1}, Q_{j+1}, \dots, Q_n)$$

(see Exercise 11 below). Using Theorem (6.3) of this section and the
argument of Proposition (4.7) of Chapter 3, we conclude that

$$\mathrm{Res}_{\mathcal{A}_0, \dots, \mathcal{A}_n}(f_0, \dots, f_n) = \pm\mathrm{GCD}(D_0, \dots, D_n).$$

As in Chapter 3, the GCD computation needs to be done for $f_0, \dots, f_n$
with symbolic coefficients.

In practice, this method for computing the sparse resultant is not very
useful, mainly because the $D_j$ tend to be enormous polynomials when the
$f_i$ have symbolic coefficients. But if we use numerical coefficients for the
$f_i$, the GCD computation doesn’t make sense. Two methods for avoiding
this difficulty are explained in Section 5 of [CE1]. Fortunately, for many
purposes, it suffices to work with just one of the $D_j$ (we will give an example
below), and $D_j$ can be computed by the methods discussed at the end of
§4 of Chapter 3.

A different but closely related approach to computing resultants is described
in [EC]. The major advance of [EC] over [CE1] is that it uses smaller
determinants to compute resultants. In a different vein, some theoretical
formulas for resultants can be found in [GKZ]. We should also mention that
[Stu3] studies the combinatorics of the mixed sparse resultant. One of the 
major unsolved problems concerning sparse resultants is whether they can 
be represented as a quotient of two determinants. In the multipolynomial 
case, this is true by Theorem (4.9) of Chapter 3. Does this have a sparse 
analog? Nobody knows! 

We will end this section with a brief discussion (omitting most proofs)
of how sparse resultants can be used to solve equations. The basic idea is
that given Laurent polynomials $f_i \in L(\mathcal{A}_i)$, we want to solve the equations

$$f_1(x_1, \dots, x_n) = \cdots = f_n(x_1, \dots, x_n) = 0. \tag{6.15}$$

If we assume that the $f_i$ are generic, then by Bernstein’s Theorem from §5,
the number of solutions in $(\mathbb{C}^*)^n$ is the mixed volume $MV_n(Q_1, \dots, Q_n)$,
where $Q_i = \mathrm{Conv}(\mathcal{A}_i)$.

To solve (6.15), we can use sparse resultants in a variety of ways, similar
to what we did in the multipolynomial case studied in Chapter 3. We begin
with a sparse version of the $u$-resultant from §5 of Chapter 3. Let

$$f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n,$$

where $u_0, \dots, u_n$ are variables. The Newton polytope of $f_0$ is $Q_0 =
\mathrm{Conv}(\mathcal{A}_0)$, where $\mathcal{A}_0 = \{0, e_1, \dots, e_n\}$ and $e_1, \dots, e_n$ are the usual standard
basis vectors. Then the $u$-resultant of $f_1, \dots, f_n$ is the resultant
$\mathrm{Res}_{\mathcal{A}_0, \dots, \mathcal{A}_n}(f_0, \dots, f_n)$, which written out more fully is

$$\mathrm{Res}_{\mathcal{A}_0, \mathcal{A}_1, \dots, \mathcal{A}_n}(u_0 + u_1 x_1 + \cdots + u_n x_n, f_1, \dots, f_n).$$

For $f_1, \dots, f_n$ generic, one can show that there is a nonzero constant $C$
such that

$$\mathrm{Res}_{\mathcal{A}_0, \dots, \mathcal{A}_n}(f_0, \dots, f_n) = C \prod_{p \in V(f_1, \dots, f_n) \cap (\mathbb{C}^*)^n} f_0(p). \tag{6.16}$$

This generalizes Theorem (5.8) of Chapter 3 and is proved using a sparse
analog (due to Pedersen and Sturmfels [PS2]) of Theorem (3.4) from
Chapter 3. If $p = (a_1, \dots, a_n)$ is a solution of (6.15) in $(\mathbb{C}^*)^n$, then

$$f_0(p) = u_0 + u_1 a_1 + \cdots + u_n a_n,$$

so that factoring the $u$-resultant gives the solutions of (6.15) in $(\mathbb{C}^*)^n$.

In (6.16), generic means that the solutions all have multiplicity 1. If some
of the multiplicities are $> 1$, the methods of Chapter 4 can be adapted to
show that

$$\mathrm{Res}_{\mathcal{A}_0, \dots, \mathcal{A}_n}(f_0, \dots, f_n) = C \prod_{p \in V(f_1, \dots, f_n) \cap (\mathbb{C}^*)^n} f_0(p)^{m(p)},$$

where $m(p)$ is the multiplicity of $p$ as defined in §2 of Chapter 4.

Many of the comments about the $u$-resultant from §5 of Chapter 3 carry over
without change to the sparse case. In particular, we saw in Chapter 3 that
for many purposes, we can replace the sparse resultant with the determinant
$D_0$. This is true in the sparse case, provided we use $D_0$ as defined in
this section. Thus, (6.16) holds using $D_0$ in place of the sparse resultant,
i.e., there is a constant $C'$ such that

$$D_0 = C' \prod_{p \in V(f_1, \dots, f_n) \cap (\mathbb{C}^*)^n} f_0(p).$$

This formula is reasonable since $D_0$, when regarded as a polynomial in the
coefficients $u_0, \dots, u_n$ of $f_0$, has degree $MV_n(Q_1, \dots, Q_n)$, which is the
number of solutions of (6.15) in $(\mathbb{C}^*)^n$. There is a similar formula when
some of the solutions have multiplicities $> 1$.

We can also find solutions of (6.15) using the eigenvalue and eigenvector
techniques discussed in §6 of Chapter 3. To see how this works, we start
with the ring $\mathbb{C}[x_1^{\pm 1}, \dots, x_n^{\pm 1}]$ of all Laurent polynomials. The Laurent
polynomials in our equations (6.15) give the ideal

$$\langle f_1, \dots, f_n \rangle \subset \mathbb{C}[x_1^{\pm 1}, \dots, x_n^{\pm 1}].$$

We want to find a basis for the quotient ring $\mathbb{C}[x_1^{\pm 1}, \dots, x_n^{\pm 1}] / \langle f_1, \dots, f_n \rangle$.
For this purpose, consider a coherent mixed subdivision of the Minkowski
sum $Q_1 + \cdots + Q_n$. If we combine Theorem (6.7) and the proof of
Theorem (6.12), we see that if $\delta$ is generic, then

$$MV_n(Q_1, \dots, Q_n) = \sum_R |\mathbb{Z}^n \cap (R + \delta)|,$$

where the sum is over all mixed cells in the mixed subdivision. Thus the
set of exponents

$$\mathcal{E} = \{\beta \in \mathbb{Z}^n : \beta \in R + \delta \text{ for some mixed cell } R\}$$

has $MV_n(Q_1, \dots, Q_n)$ elements. This set gives the desired basis of our
quotient ring.

(6.17) Theorem. For the set $\mathcal{E}$ described above, the cosets $[x^\beta]$ for $\beta \in \mathcal{E}$
form a basis of the quotient ring $\mathbb{C}[x_1^{\pm 1}, \dots, x_n^{\pm 1}] / \langle f_1, \dots, f_n \rangle$.

Proof. This was proved independently in [ER] and [PS1]. In the terminology
of [PS1], the cosets $[x^\beta]$ for $\beta \in \mathcal{E}$ form a mixed monomial basis
since they come from the mixed cells of a mixed subdivision.

We will prove this in the following special case. Consider $f_0 = u_0 +
u_1 x_1 + \cdots + u_n x_n$, and let $\mathcal{A}_0$ and $Q_0$ be as above. Then pick a coherent
mixed subdivision of $Q = Q_0 + Q_1 + \cdots + Q_n$ and let $\mathcal{E}_0 = \mathbb{Z}^n \cap (Q + \delta)$.
Also define $S_i \subset \mathcal{E}_0$ using (6.8) with smallest replaced by largest. Using the
first “nice fact” used in the proof of Theorem (6.12), one can show that the
coherent mixed subdivision of $Q$ induces a coherent mixed subdivision of
$Q_1 + \cdots + Q_n$. We will show that the theorem holds for the set $\mathcal{E}$ coming
from this subdivision.

The first step in the proof is to show that

$$\alpha \in S_0 \iff \alpha = v(\alpha) + \beta \text{ for some } v(\alpha) \in \mathcal{A}_0 \text{ and } \beta \in \mathcal{E}. \tag{6.18}$$

This follows from the arguments used in the proof of Theorem (6.12). Now
let $M_0$ be the coefficient matrix of the equations (6.10). These equations
begin with

$$(x^\alpha / x^{v(\alpha)}) f_0 = 0, \quad \alpha \in S_0,$$

which, using (6.18), can be rewritten as

$$x^\beta f_0 = 0, \quad \beta \in \mathcal{E}. \tag{6.19}$$

From here, we will follow the proof of Theorem (6.2) of Chapter 3. We
partition $M_0$ so that the rows and columns of $M_0$ corresponding to elements
of $S_0$ lie in the upper left hand corner, so that

$$M_0 = \begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix}.$$

By Lemma 4.4 of [Emi1], $M_{11}$ is invertible for generic $f_1, \dots, f_n$ since we
are working with a coherent mixed subdivision; the argument is similar to
showing $D_0 \ne 0$ in the proof of Theorem (6.12).

Now let $\mathcal{E} = \{\beta_1, \dots, \beta_\mu\}$, where $\mu = MV_n(Q_1, \dots, Q_n)$. Then, for
generic $f_1, \dots, f_n$, we define the $\mu \times \mu$ matrix

$$M = M_{00} - M_{01} M_{11}^{-1} M_{10}. \tag{6.20}$$

Also, for $p \in V(f_1, \dots, f_n) \cap (\mathbb{C}^*)^n$, let $p^\beta$ denote the column vector

$$p^\beta = \begin{pmatrix} p^{\beta_1} \\ \vdots \\ p^{\beta_\mu} \end{pmatrix}.$$

Similar to (6.6) in Chapter 3, one can prove

$$M p^\beta = f_0(p)\, p^\beta$$

because (6.19) gives the rows of $M_0$ coming from $S_0$.

The final step is to show that the cosets $[x^{\beta_1}], \dots, [x^{\beta_\mu}]$ are linearly
independent. The argument is identical to what we did in Theorem (6.2)
of Chapter 2. □

Using the mixed monomial basis, the next step is to find the matrix of
the multiplication map $m_{f_0} : A \to A$, where

$$A = \mathbb{C}[x_1^{\pm 1}, \dots, x_n^{\pm 1}] / \langle f_1, \dots, f_n \rangle$$

and $m_{f_0}([g]) = [f_0 g]$ for $[g] \in A$. As in Chapter 3, this follows immediately
from the previous result.

(6.21) Theorem. Let $f_i \in L(\mathcal{A}_i)$ be generic Laurent polynomials, and
let $f_0 = u_0 + u_1 x_1 + \cdots + u_n x_n$. Using the basis from Theorem (6.17),
the matrix of the multiplication map $m_{f_0} : A \to A$ defined above is the
transpose of the matrix

$$M = M_{00} - M_{01} M_{11}^{-1} M_{10}$$

from (6.20).

If we write $M$ in the form

$$M = u_0 I + u_1 M_1 + \cdots + u_n M_n,$$

where each $M_i$ has constant entries, then Theorem (6.21) implies that for
all $i$, $(M_i)^T$ is the matrix of multiplication by $x_i$. Thus, as in Chapter 3,
$M$ simultaneously computes the matrices of the multiplication maps by all
of the variables $x_1, \dots, x_n$.
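
The shape of these statements can be seen in a one-variable toy case. The sketch below does not build $M$ from a sparse resultant matrix; instead it assumes the hypothetical polynomial $f_1 = x^2 - 3x + 2$ (roots 1 and 2 in $\mathbb{C}^*$), writes down the multiplication-by-$x$ matrix on the basis $\{1, x\}$ by hand, and checks that the transpose of the matrix of $m_{f_0}$ has the vectors of basis monomials evaluated at the roots as eigenvectors, with eigenvalues $f_0(p)$. The names and sample values of $u_0, u_1$ are our own.

```python
# Basis {1, x} of C[x, 1/x] / <x^2 - 3x + 2>.  Columns hold the coordinates
# of x * 1 = x and x * x = 3x - 2 in that basis.
M_X = [[0, -2],
       [1, 3]]


def mult_matrix_f0(u0, u1):
    """Matrix of multiplication by f_0 = u0 + u1 * x on the basis {1, x}."""
    return [[u0 * (1 if r == c else 0) + u1 * M_X[r][c] for c in range(2)]
            for r in range(2)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


u0, u1 = 5, 7  # sample numerical values for the variables u_0, u_1
M = transpose(mult_matrix_f0(u0, u1))  # plays the role of M in (6.20)

checks = []
for root in (1, 2):
    p_beta = [1, root]              # basis monomials 1, x evaluated at the root
    f0_at_root = u0 + u1 * root
    checks.append(mat_vec(M, p_beta) == [f0_at_root * t for t in p_beta])
print(checks)
```

Here the number of basis elements, 2, is also the length of the Newton polytope $[0, 2]$ of $f_1$, in line with the count $MV_n(Q_1, \dots, Q_n)$ in the general statement.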

Now that we have these multiplication maps, the methods mentioned in 
Chapters 2 and 3 apply with little change. More detailed discussions of how 
to solve equations using resultants, including some substantial examples, 
can be found in [Emi1], [Emi2], [ER], [Man1], and [Roj4].

We should mention that other techniques introduced in Chapter 3 can 
be adapted to the sparse case. For example, the generalized characteristic 
polynomial (GCP) from §6 of Chapter 3 can be generalized to the toric 
GCP defined in [Roj2]. This is useful for dealing with the types of degen- 
eracies discussed in Chapter 3. In a similar vein, there is a refinement of 
the u-resultant used in (6.16), called the twisted Chow form [Roj2]. As ex- 
plained in Section 2.3 of [Roj2], there are situations where the u-resultant 
vanishes identically yet the twisted Chow form is nonzero. 



Additional Exercises for §6 

Exercise 9. Consider the following system of equations taken from [Stu3]:

$$\begin{aligned} 0 &= f_0 = ax + by \\ 0 &= f_1 = cx + dy \\ 0 &= f_2 = ex + fy + g. \end{aligned}$$

a. Explain why the hypothesis of Theorem (6.2) is not satisfied. Hint: Look
at the Newton polytopes.

b. Show that the sparse resultant exists and is given by $\mathrm{Res}(f_0, f_1, f_2) =
ad - bc$.

Exercise 10. In Exercise 7, we defined the decomposition $\mathcal{E} = S_0 \cup S_1 \cup S_2$
using coherent Minkowski sums $R = F_0 + F_1 + F_2$. This exercise will explore
what can go wrong if we don’t use coherent sums.

a. Exercise 7 gave the coherent Minkowski sum $R_2 = \text{edge of } Q_0 + (0, 1) +
\text{edge of } Q_2$. Show that $R_2 = (0, 1) + \text{edge of } Q_1 + \text{edge of } Q_2$ also holds.

b. If we use coherent Minkowski sums for $R_i$ when $i \ne 2$ and the
non-coherent one from part a when $i = 2$, show that (6.8) gives $S_0 =
\{x^3 y, x^2 y^2, x^2 y, x y^3, x y^2\}$, $S_1 = \emptyset$ and $S_2 = \{x y\}$.

c. If we compute the determinant $D_2$ using $S_0, S_1, S_2$ as in part b, show
that $D_2$ does not involve the coefficients of $f_1$ and conclude that $D_2$ is
identically zero in this case. Hint: You don’t need explicit computations.
Argue instead that $D_2$ is divisible by $\mathrm{Res}_{1,1,2}$.

Exercise 11. This exercise will discuss the determinant $D_j$ for $j < n$.
The index $j$ will be fixed throughout the exercise. Given $\mathcal{E}$ as usual, define
the subset $S_i \subset \mathcal{E}$ to consist of all $\alpha \in \mathcal{E}$ such that if $\alpha \in R + \delta$, where
$R = F_0 + \cdots + F_n$ is coherent, then

$$i = \begin{cases} j & \text{if } \dim(F_k) > 0 \ \forall k \ne j \\ \min(k \ne j : F_k \text{ is a vertex}) & \text{otherwise.} \end{cases}$$

By adapting the proof of Theorem (6.12), explain why this gives a
determinant $D_j$ which is a nonzero multiple of the resultant and whose
degree as a polynomial in the coefficients of $f_j$ is the mixed volume

$$MV_n(Q_0, \dots, Q_{j-1}, Q_{j+1}, \dots, Q_n).$$

Exercise 12. Prove that as polynomials with integer coefficients, we have

$$\mathrm{Res}_{\mathcal{A}_0, \dots, \mathcal{A}_n}(f_0, \dots, f_n) = \pm\mathrm{GCD}(D_0, \dots, D_n).$$

Hint: Since $D_j$ and $\mathrm{Res}_{\mathcal{A}_0, \dots, \mathcal{A}_n}$ have the same degrees when regarded as
polynomials in the coefficients of $f_j$, it is relatively easy to prove this over $\mathbb{Q}$.
To prove that it is true over $\mathbb{Z}$, it suffices to show that the coefficients of each
$D_j$ are relatively prime. To prove this for $j = n$, consider the polynomials
$f_i$ defined in (6.13) and use the argument of Section 4 of [CE1] (or, for a
more detailed account, Section 5 of [CE2]) to show that $D_n$ has leading
coefficient 1 as a polynomial in $t$.

Exercise 13. Compute the mixed sparse resultant of the polynomials

$$\begin{aligned} f_0 &= a_1 + a_2 xy + a_3 x^2 y + a_4 x \\ f_1 &= b_1 y + b_2 x^2 y^2 + b_3 x^2 y + b_4 x \\ f_2 &= c_1 + c_2 y + c_3 xy + c_4 x. \end{aligned}$$

Hint: To obtain a coherent mixed subdivision, let $l_0 = (L, L^2)$, $l_1 =
-(L^2, 1)$ and $l_2 = (1, -L)$, where $L$ is a sufficiently large positive integer.
Also let $\delta = -(3/8, 1/8)$. The full details of this example, including
the explicit matrix giving $D_0$, can be found in [CE1].




Chapter 8 

Integer Programming, 
Combinatorics, and Splines 



In this chapter we will consider a series of interrelated topics concerning
polyhedral regions $P$ in $\mathbb{R}^n$, the integer points in these regions, and piecewise
polynomial functions on subdivisions of such regions. In each case
Gröbner basis techniques give important computational and conceptual
tools to deal with problems of interest and practical importance. These 
topics are also closely related to the material on polytopes and toric vari- 
eties from Chapter 7, but we have tried to make this chapter as independent 
as possible from Chapter 7 so that it can be read separately. 



§1 Integer Programming 

This section applies the theory of Gröbner bases to problems in integer
programming. Most of the results depend only on the basic algebra of polynomial
rings and facts about Gröbner bases for ideals. From Proposition
(1.12) on, we will also need to use the language of Laurent polynomials, 
but the idea should be reasonably clear even if that concept is not familiar. 
The original reference for this topic is an article by Conti and Traverso, 
[CT], and another treatment may be found in [AL], section 2.8. Further 
developments may be found in the article [Tho] and the book [Stu2]. For 
a general introduction to linear and integer programming, we recommend 
[Schri]. 

To begin, we will consider a very small, but in other ways typical, applied 
integer programming problem, and we will use this example to illustrate the 
key features of this class of problems. Suppose that a small local trucking 
firm has two customers, A and B, that generate shipments to the same 
location. Each shipment from A is a pallet weighing exactly 400 kilos and 
taking up 2 cubic meters of volume. Each pallet from B weighs 500 kilos
and takes up 3 cubic meters. The shipping firm uses small trucks that
can carry any load up to 3700 kilos, and up to 20 cubic meters. B’s product
is more perishable, though, and they are willing to pay a higher price for
on-time delivery: $15 per pallet versus $11 per pallet from A. The question
facing the manager of the trucking company is: How many pallets from each
of the two companies should be included in each truckload to maximize the
revenues generated?

Using $A$ to represent the number of pallets from company A, and similarly
$B$ to represent the number of pallets from company B in a truckload,
we want to maximize the revenue function $11A + 15B$ subject to the
following constraints:

$$\begin{aligned} 4A + 5B &\le 37 \quad \text{(the weight limit, in 100's)} \\ 2A + 3B &\le 20 \quad \text{(the volume limit)} \\ A, B &\in \mathbb{Z}_{\ge 0}. \end{aligned} \tag{1.1}$$

Note that both $A$, $B$ must be integers. This is, as we will see, an important
restriction, and the characteristic feature of integer programming problems.

Integer programming problems are generalizations of the mathematical
translation of the question above. Namely, in an integer programming
problem we seek the maximum or minimum value of some linear function

$$\ell(A_1, \dots, A_n) = c_1 A_1 + c_2 A_2 + \cdots + c_n A_n$$

on the set of $(A_1, \dots, A_n) \in \mathbb{Z}^n_{\ge 0}$ with $A_j \ge 0$ for all $1 \le j \le n$ satisfying
a set of linear inequalities:

$$\begin{aligned} a_{11} A_1 + a_{12} A_2 + \cdots + a_{1n} A_n &\le (\text{or} \ge)\ b_1 \\ a_{21} A_1 + a_{22} A_2 + \cdots + a_{2n} A_n &\le (\text{or} \ge)\ b_2 \\ &\ \ \vdots \\ a_{m1} A_1 + a_{m2} A_2 + \cdots + a_{mn} A_n &\le (\text{or} \ge)\ b_m. \end{aligned}$$

We assume in addition that the $a_{ij}$ and the $b_i$ are all integers. Some of the
coefficients $c_j$, $a_{ij}$, $b_i$ may be negative, but we will always assume $A_j \ge 0$
for all $j$.

Integer programming problems occur in many contexts in engineering,
computer science, operations research, and pure mathematics. With large
numbers of variables and constraints, they can be difficult to solve. It is
perhaps instructive to consider our small shipping problem (1.1) in detail.
In geometric terms we are seeking a maximum for the function $11A + 15B$
on the integer points in the closed convex polygon $P$ in $\mathbb{R}^2$ bounded above
by portions of the lines $4A + 5B = 37$ (slope $-4/5$), $2A + 3B = 20$ (slope
$-2/3$), and by the coordinate axes $A = 0$, and $B = 0$. See Fig. 8.1. The
set of all points in $\mathbb{R}^2$ satisfying the inequalities from (1.1) is known as the
feasible region.

(1.2) Definition. The feasible region of an integer programming problem
is the set $P$ of all $(A_1, \dots, A_n) \in \mathbb{R}^n$ satisfying the inequalities in the
statement of the problem.

Figure 8.1. The feasible region P for (1.1)

For readers of Chapter 7, we note that the inequalities in an integer
programming problem can be written in the same form as those in (1.4) of
that chapter. If the feasible region of an integer programming problem
is a bounded set in $\mathbb{R}^n$, then it is a polytope. But other, more general,
unbounded polyhedral regions also occur in this context.

It is possible for the feasible region of an integer programming problem to
contain no integer points at all. There are no solutions of the optimization
problem in that case. For instance in $\mathbb{R}^2$ consider the region defined by

$$\begin{aligned} A + B &\le 1 \\ 3A - B &\ge 1 \\ 2A - B &\le 1, \end{aligned} \tag{1.3}$$

and $A, B \ge 0$.

Exercise 1. Verify directly (for example with a picture) that there are no 
integer points in the region defined by (1.3). 
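
A quick machine check complements the picture. The sketch below reads the inequalities in (1.3) as $A + B \le 1$, $3A - B \ge 1$, $2A - B \le 1$ with $A, B \ge 0$; since $A + B \le 1$ forces $0 \le A, B \le 1$, only four integer candidates need to be tested. The rational witness point $(1/2, 1/4)$ is our own choice, included to show that the region itself is not empty.

```python
from fractions import Fraction


def in_region(a, b):
    """Membership test for the region defined by (1.3)."""
    return (a + b <= 1 and 3 * a - b >= 1 and 2 * a - b <= 1
            and a >= 0 and b >= 0)


# A + B <= 1 with A, B >= 0 leaves only these integer candidates.
integer_points = [(a, b) for a in range(2) for b in range(2) if in_region(a, b)]

# A rational point of the region, so the feasible region is nonempty.
witness = (Fraction(1, 2), Fraction(1, 4))

print(integer_points, in_region(*witness))
```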

When n is small, it is often possible to analyze the feasible set of an inte- 
ger programming problem geometrically and determine the integer points 
in it. However, even this can be complicated since any polyhedral region 
formed by intersecting half-spaces bounded by affine hyperplanes with 
equations defined over $\mathbb{Z}$ can occur. For example, consider the set $P$ in
$\mathbb{R}^3$ defined by the inequalities:

$$\begin{aligned} 2A_1 + 2A_2 + 2A_3 &\le 5 \\ -2A_1 + 2A_2 + 2A_3 &\le 5 \\ 2A_1 + 2A_2 - 2A_3 &\le 5 \\ -2A_1 + 2A_2 - 2A_3 &\le 5 \\ 2A_1 - 2A_2 + 2A_3 &\le 5 \\ -2A_1 - 2A_2 + 2A_3 &\le 5 \\ 2A_1 - 2A_2 - 2A_3 &\le 5 \\ -2A_1 - 2A_2 - 2A_3 &\le 5. \end{aligned}$$

In Exercise 11, you will show that P is a solid regular octahedron, with 8 
triangular faces, 12 edges, and 6 vertices. 

Returning to the problem from (1.1), if we did not have the additional
constraints $A, B \in \mathbb{Z}$ (if we were trying to solve a linear programming
problem rather than an integer programming problem), the situation would
be somewhat easier to analyze. For instance, to solve (1.1), we could apply
the following simple geometric reasoning. The level curves of the revenue
function $\ell(A, B) = 11A + 15B$ are lines of slope $-11/15$. The values of
$\ell$ increase as we move out into the first quadrant. Since the slopes satisfy
$-4/5 < -11/15 < -2/3$, it is clear that the revenue function attains its
overall maximum on $P$ at the vertex $q$ in the interior of the first quadrant.
Readers of Chapter 7 will recognize $q$ as the face of $P$ in the support line
with normal vector $u = (-11, -15)$. See Fig. 8.2.

Figure 8.2. The linear programming maximum for (1.1)

That point has rational, but not integer coordinates: $q = (11/2, 3)$.
Hence $q$ is not the solution of the integer programming problem! Instead,
we need to consider only the integer points $(A, B)$ in $P$. One ad hoc method
that works here is to fix $A$, compute the largest $B$ such that $(A, B)$ lies in
$P$, then compute the revenue function at those points and compare values
for all possible $A$ values. For instance, with $A = 4$, the largest $B$ giving a
point in $P$ is $B = 4$, and we obtain $\ell(4, 4) = 104$. Similarly, with $A = 8$, the
largest feasible $B$ is $B = 1$, and we obtain $\ell(8, 1) = 103$. Note incidentally
that both of these values are larger than the value of $\ell$ at the integer point
closest to $q$ in $P$, namely $(A, B) = (5, 3)$, where $\ell(5, 3) = 100$. This shows some
of the potential subtlety of integer programming problems. Continuing in
this way it can be shown that the maximum of $\ell$ occurs at $(A, B) = (4, 4)$.

Exercise 2. Verify directly (that is, by enumerating integer points as
suggested above) that the solution of the shipping problem (1.1) is the point
$(A, B) = (4, 4)$.
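
The enumeration is small enough to automate. The following sketch lists every integer point of the feasible region of (1.1), using the bounds $A \le 9$ and $B \le 6$ that follow from the two constraints, and compares the integer optimum with the linear programming vertex $q$ obtained by solving the two boundary equations with Cramer's rule. The function names are our own.

```python
from fractions import Fraction


def revenue(a, b):
    return 11 * a + 15 * b


def feasible(a, b):
    """Constraints of the shipping problem (1.1)."""
    return 4 * a + 5 * b <= 37 and 2 * a + 3 * b <= 20 and a >= 0 and b >= 0


# 4A <= 37 gives A <= 9, and 3B <= 20 gives B <= 6.
candidates = [(a, b) for a in range(10) for b in range(7) if feasible(a, b)]
best = max(candidates, key=lambda point: revenue(*point))

# Vertex q: solve 4A + 5B = 37, 2A + 3B = 20 by Cramer's rule.
det = 4 * 3 - 5 * 2
q = (Fraction(37 * 3 - 5 * 20, det), Fraction(4 * 20 - 2 * 37, det))

print(best, revenue(*best))   # integer optimum
print(q, revenue(*q))         # linear programming optimum
```

The gap between the two optimal values illustrates why rounding the linear programming solution to a nearby integer point is not a reliable strategy.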

This sort of approach would be quite impractical for larger problems. 
Indeed, the general integer programming problem is known to be NP- 
complete, and so as Conti and Traverso remark, “even algorithms with 
theoretically bad worst case and average complexity can be useful ... , 
hence deserve investigation.” 

To discuss integer programming problems in general it will be helpful 
to standardize their statement to some extent. This can be done using the 
following observations. 

1. We need only consider the problem of minimizing the linear function
$\ell(A_1, \ldots, A_n) = c_1 A_1 + c_2 A_2 + \cdots + c_n A_n$, since maximizing a function
$\ell$ on a set of integer n-tuples is the same as minimizing the function $-\ell$.

2. Similarly, by replacing an inequality

\[ a_{i1} A_1 + \cdots + a_{in} A_n \ge b_i \]

by the equivalent form

\[ -a_{i1} A_1 - \cdots - a_{in} A_n \le -b_i \]

we may consider only inequalities involving $\le$.

3. Finally, by introducing additional variables, we can rewrite the linear 
constraint inequalities as equalities. The new variables are called “slack 
variables.” 

For example, using the idea in point 3 here, the inequality

\[ 3A_1 - A_2 + 2A_3 \le 9 \]

can be replaced by

\[ 3A_1 - A_2 + 2A_3 + A_4 = 9 \]

if $A_4 = 9 - (3A_1 - A_2 + 2A_3) \ge 0$ is introduced as a new variable to “take
up the slack” in the original inequality. Slack variables will appear with
coefficient zero in the function to be minimized.

Applying 1, 2, and 3 above, any integer programming problem can be 
put into the standard form: 

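Minimize: $c_1 A_1 + \cdots + c_n A_n$, subject to:

(1.4)
\[
\begin{aligned}
a_{11} A_1 + a_{12} A_2 + \cdots + a_{1n} A_n &= b_1 \\
a_{21} A_1 + a_{22} A_2 + \cdots + a_{2n} A_n &= b_2 \\
&\ \ \vdots \\
a_{m1} A_1 + a_{m2} A_2 + \cdots + a_{mn} A_n &= b_m \\
A_j \in \mathbb{Z}_{\ge 0}, \quad j &= 1, \ldots, n,
\end{aligned}
\]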

where now n is the total number of variables (including slack variables). 
As before, we will call the set of all real n-tuples satisfying the constraint 
equations the feasible region. 
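
As an illustration of observations 1, 2, and 3, here is a small sketch of the conversion to standard form. The function name `to_standard_form` and the input convention (a list of triples of coefficients, sense, and right-hand side) are hypothetical choices made for this example; the constraint data for the shipping problem are assumed to be 4A + 5B <= 37 and 2A + 3B <= 20.

```python
def to_standard_form(c, rows, maximize=False):
    """Return (c_std, A_std, b_std) for: minimize c_std.x subject to A_std x = b_std.

    rows is a list of (coeffs, sense, rhs) with sense one of "<=", ">=", "=".
    """
    # Observation 1: maximizing l is the same as minimizing -l.
    c_std = [-cj for cj in c] if maximize else list(c)

    # Observation 2: rewrite ">=" as "<=" by negating both sides.
    normalized = []
    for coeffs, sense, rhs in rows:
        if sense == ">=":
            coeffs, sense, rhs = [-a for a in coeffs], "<=", -rhs
        normalized.append((list(coeffs), sense, rhs))

    # Observation 3: add one slack variable per inequality.
    n_slack = sum(1 for _, sense, _ in normalized if sense == "<=")
    A_std, b_std, k = [], [], 0
    for coeffs, sense, rhs in normalized:
        slack = [0] * n_slack
        if sense == "<=":
            slack[k] = 1
            k += 1
        A_std.append(coeffs + slack)
        b_std.append(rhs)

    # Slack variables get coefficient zero in the function to be minimized.
    return c_std + [0] * n_slack, A_std, b_std

c_std, A_std, b_std = to_standard_form(
    [11, 15], [([4, 5], "<=", 37), ([2, 3], "<=", 20)], maximize=True)
print(c_std)  # [-11, -15, 0, 0]
print(A_std)  # [[4, 5, 1, 0], [2, 3, 0, 1]]
print(b_std)  # [37, 20]
```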

For the rest of this section we will explore an alternative approach to 
integer programming problems, in which we translate such a problem into 
a question about polynomials. We will use the standard form (1.4) and first 
consider the case where all the coefficients are nonnegative: $a_{ij} \ge 0$, $b_i \ge 0$.
The translation proceeds as follows. We introduce an indeterminate $z_i$ for
each of the equations in (1.4), and exponentiate to obtain an equality

\[ z_i^{a_{i1} A_1 + a_{i2} A_2 + \cdots + a_{in} A_n} = z_i^{b_i} \]

for each $i = 1, \ldots, m$. Multiplying the left and right hand sides of these
equations, and rearranging the exponents, we get another equality:

(1.5)
\[ \prod_{j=1}^{n} \Bigl( \prod_{i=1}^{m} z_i^{a_{ij}} \Bigr)^{A_j} = \prod_{i=1}^{m} z_i^{b_i}. \]

From (1.5) we get the following direct algebraic characterization of the 
integer n-tuples in the feasible region of the problem (1.4). 

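(1.6) Proposition. Let k be a field, and define $\varphi : k[w_1, \ldots, w_n] \to
k[z_1, \ldots, z_m]$ by setting

\[ \varphi(w_j) = \prod_{i=1}^{m} z_i^{a_{ij}} \]

for each $j = 1, \ldots, n$, and $\varphi(g(w_1, \ldots, w_n)) = g(\varphi(w_1), \ldots, \varphi(w_n))$ for
a general polynomial $g \in k[w_1, \ldots, w_n]$. Then $(A_1, \ldots, A_n)$ is an
integer point in the feasible region if and only if $\varphi$ maps the monomial
$w_1^{A_1} w_2^{A_2} \cdots w_n^{A_n}$ to the monomial $z_1^{b_1} \cdots z_m^{b_m}$.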

Exercise 3. Prove Proposition (1.6).

For example, consider the standard form of our shipping problem (1.1),
with slack variables C in the first equation and D in the second.

$\varphi : k[w_1, w_2, w_3, w_4] \to k[z_1, z_2]$

(1.7)
\[
\begin{aligned}
w_1 &\mapsto z_1^4 z_2^2 \\
w_2 &\mapsto z_1^5 z_2^3 \\
w_3 &\mapsto z_1 \\
w_4 &\mapsto z_2.
\end{aligned}
\]

The integer points in the feasible region of this restatement of the problem
are the $(A, B, C, D)$ such that

\[ \varphi(w_1^A w_2^B w_3^C w_4^D) = z_1^{37} z_2^{20}. \]

Exercise 4. Show that in this case every monomial in $k[z_1, \ldots, z_m]$ is the
image of some monomial in $k[w_1, \ldots, w_n]$.
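
The criterion of Proposition (1.6) amounts to comparing exponent vectors. The sketch below illustrates this for the shipping problem; the coefficient matrix and right-hand side are assumptions read off from the map (1.7) and from the monomial f used later in this section, and the function names are hypothetical.

```python
# Rows correspond to z_1, z_2; columns to w_1, w_2, w_3, w_4 (A, B, C, D).
A_MATRIX = [[4, 5, 1, 0],
            [2, 3, 0, 1]]
B_VECTOR = [37, 20]

def phi_exponents(point, a=A_MATRIX):
    """Exponent vector in z_1, ..., z_m of the image of the monomial w^point."""
    return [sum(a_ij * A_j for a_ij, A_j in zip(row, point)) for row in a]

def is_feasible(point, a=A_MATRIX, b=B_VECTOR):
    """True when point is a nonnegative integer solution of the constraints."""
    return all(x >= 0 for x in point) and phi_exponents(point, a) == b

print(phi_exponents((4, 4, 1, 0)))  # [37, 20]
print(is_feasible((4, 4, 1, 0)))    # True
print(is_feasible((5, 3, 0, 0)))    # False: slack variables are missing
print(is_feasible((5, 3, 2, 1)))    # True
```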

In other cases, $\varphi$ may not be surjective, and the following test for
membership in the image of a mapping is an important part of the translation
of integer programming problems.

Since the image of $\varphi$ in Proposition (1.6) is precisely the set of
polynomials in $k[z_1, \ldots, z_m]$ that can be expressed as polynomials in the
$f_j = \prod_{i=1}^{m} z_i^{a_{ij}}$, we can also write the image as $k[f_1, \ldots, f_n]$, the
subring of $k[z_1, \ldots, z_m]$ generated by the $f_j$. The subring membership test
given by parts a and b of the following Proposition is also used in studying
rings of invariants for finite matrix groups (see [CLO], Chapter 7, §3).

(1.8) Proposition. Suppose that $f_1, \ldots, f_n \in k[z_1, \ldots, z_m]$ are given.
Fix a monomial order in $k[z_1, \ldots, z_m, w_1, \ldots, w_n]$ with the elimination
property: any monomial containing one of the $z_i$ is greater than any
monomial containing only the $w_j$. Let $G$ be a Gröbner basis for the ideal

\[ I = \langle f_1 - w_1, \ldots, f_n - w_n \rangle \subset k[z_1, \ldots, z_m, w_1, \ldots, w_n] \]

and for each $f \in k[z_1, \ldots, z_m]$, let $\overline{f}^{G}$ be the remainder on division of f
by $G$. Then

a. A polynomial f satisfies $f \in k[f_1, \ldots, f_n]$ if and only if $g = \overline{f}^{G} \in
k[w_1, \ldots, w_n]$.

b. If $f \in k[f_1, \ldots, f_n]$ and $g = \overline{f}^{G} \in k[w_1, \ldots, w_n]$ as in part a, then
$f = g(f_1, \ldots, f_n)$, giving an expression for f as a polynomial in the $f_j$.

c. If each $f_j$ and f are monomials and $f \in k[f_1, \ldots, f_n]$, then g is also a
monomial.

In other words, part c says that in the situation of Proposition (1.6), if
$z_1^{b_1} \cdots z_m^{b_m}$ is in the image of $\varphi$, then it is automatically the image of some
monomial $w_1^{A_1} \cdots w_n^{A_n}$.

Proof. Parts a and b are proved in Proposition 7 of Chapter 7, §3 in
[CLO], so we will not repeat them here.

To prove c, we note that each generator of I is a difference of two
monomials. It follows that in the application of Buchberger’s algorithm to
compute $G$, each S-polynomial considered and each nonzero S-polynomial
remainder that goes into the Gröbner basis will be a difference of two
monomials. This is true since in computing the S-polynomial, we are
subtracting one difference of two monomials from another, and the leading
terms cancel. Similarly, in the remainder calculation, at each step we
subtract one difference of two monomials from another and cancellation occurs.
It follows that every element of $G$ will also be a difference of two
monomials. When we divide a monomial by a Gröbner basis of this form, the
remainder must be a monomial, since at each step we subtract a
difference of two monomials from a single monomial and a cancellation occurs.
Hence, if we are in the situation of parts a and b and the remainder is
$g(w_1, \ldots, w_n) \in k[w_1, \ldots, w_n]$, then g must be a monomial. □

In the restatement of our example problem in (1.7), we would consider
the ideal

\[ I = \langle z_1^4 z_2^2 - w_1,\; z_1^5 z_2^3 - w_2,\; z_1 - w_3,\; z_2 - w_4 \rangle. \]

Using the lex order with the variables ordered

\[ z_1 > z_2 > w_4 > w_3 > w_2 > w_1 \]

(chosen to eliminate terms involving slack variables if possible), we obtain
a Gröbner basis $G$:

(1.9)
\[
\begin{aligned}
g_1 &= z_1 - w_3, \\
g_2 &= z_2 - w_4, \\
g_3 &= w_4^2 w_3^4 - w_1, \\
g_4 &= w_4 w_3^3 w_2 - w_1^2, \\
g_5 &= w_4 w_3 w_1 - w_2, \\
g_6 &= w_4 w_1^4 - w_3 w_2^3, \\
g_7 &= w_3^2 w_2^2 - w_1^3.
\end{aligned}
\]

(Note: An efficient implementation of Buchberger’s algorithm is
necessary for working out relatively large explicit examples using this approach,
because of the large number of variables involved. We used Singular and
Macaulay to compute the examples in this chapter.) So for instance, using
$g_1$ and $g_2$ the monomial $f = z_1^{37} z_2^{20}$ reduces to $w_3^{37} w_4^{20}$. Hence f is in the
image of $\varphi$ from (1.7). But then further reductions are also possible, and
the remainder on division is

\[ \overline{f}^{G} = w_2^4 w_1^4 w_3. \]

This monomial corresponds to the solution of the integer programming
problem ($A = 4$, $B = 4$, and slack $C = 1$) that you verified in Exercise
2. In a sense, this is an accident, since the lex order that we used for
the Gröbner basis and remainder computations did not take the revenue
function $\ell$ explicitly into account.
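
Because every element of the basis (1.9) is a difference of two monomials, the remainder of a monomial can be computed by rewriting exponent vectors. The following sketch does this for $f = z_1^{37} z_2^{20}$. It is only an illustration: the rule table is typed in by hand from (1.9), with the lex leading monomial of each $g_i$ on the left, and the names `RULES` and `reduce_monomial` are hypothetical.

```python
# Exponent vectors are ordered (z1, z2, w4, w3, w2, w1), matching the lex
# order z1 > z2 > w4 > w3 > w2 > w1 used for (1.9).
RULES = [
    ((1, 0, 0, 0, 0, 0), (0, 0, 0, 1, 0, 0)),  # g1: z1          -> w3
    ((0, 1, 0, 0, 0, 0), (0, 0, 1, 0, 0, 0)),  # g2: z2          -> w4
    ((0, 0, 2, 4, 0, 0), (0, 0, 0, 0, 0, 1)),  # g3: w4^2 w3^4   -> w1
    ((0, 0, 1, 3, 1, 0), (0, 0, 0, 0, 0, 2)),  # g4: w4 w3^3 w2  -> w1^2
    ((0, 0, 1, 1, 0, 1), (0, 0, 0, 0, 1, 0)),  # g5: w4 w3 w1    -> w2
    ((0, 0, 1, 0, 0, 4), (0, 0, 0, 1, 3, 0)),  # g6: w4 w1^4     -> w3 w2^3
    ((0, 0, 0, 2, 2, 0), (0, 0, 0, 0, 0, 3)),  # g7: w3^2 w2^2   -> w1^3
]

def reduce_monomial(mono, rules=RULES):
    """Replace leading monomials by trailing ones until no rule applies."""
    mono = tuple(mono)
    changed = True
    while changed:
        changed = False
        for lead, tail in rules:
            if all(m >= l for m, l in zip(mono, lead)):
                mono = tuple(m - l + t for m, l, t in zip(mono, lead, tail))
                changed = True
                break
    return mono

remainder = reduce_monomial((37, 20, 0, 0, 0, 0))
print(remainder)  # (0, 0, 0, 1, 4, 4), that is, w3 * w2^4 * w1^4
```

Each rewriting step replaces a monomial by a smaller one in the lex order, so the loop terminates; the final exponents of $w_1, w_2, w_3, w_4$ are 4, 4, 1, 0.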

To find the solution of an integer programming problem minimizing a
given linear function $\ell(A_1, \ldots, A_n)$ we will usually need to use a monomial
order specifically tailored to the problem at hand.

(1.10) Definition. A monomial order on $k[z_1, \ldots, z_m, w_1, \ldots, w_n]$ is said
to be adapted to an integer programming problem (1.4) if it has the
following two properties:

a. (Elimination) Any monomial containing one of the $z_i$ is greater than
any monomial containing only the $w_j$.

b. (Compatibility with $\ell$) Let $A = (A_1, \ldots, A_n)$ and $A' = (A'_1, \ldots, A'_n)$.
If the monomials $w^A$, $w^{A'}$ satisfy $\varphi(w^A) = \varphi(w^{A'})$ and $\ell(A_1, \ldots, A_n) >
\ell(A'_1, \ldots, A'_n)$, then $w^A > w^{A'}$.

(1.11) Theorem. Consider an integer programming problem in standard
form (1.4). Assume all $a_{ij}, b_i \ge 0$ and let $f_j = \prod_{i=1}^{m} z_i^{a_{ij}}$ as before. Let $G$
be a Gröbner basis for

\[ I = \langle f_1 - w_1, \ldots, f_n - w_n \rangle \subset k[z_1, \ldots, z_m, w_1, \ldots, w_n] \]

with respect to any adapted monomial order. Then if $f = z_1^{b_1} \cdots z_m^{b_m}$ is
in $k[f_1, \ldots, f_n]$, the remainder $\overline{f}^{G} \in k[w_1, \ldots, w_n]$ will give a solution of
(1.4) minimizing $\ell$. (There are cases where the minimum is not unique and,
if so, this method will only find one minimum.)

Proof. Let $G$ be a Gröbner basis for I with respect to an adapted
monomial order. Suppose that $w^A = \overline{f}^{G}$, so $\varphi(w^A) = f$, but that
$A = (A_1, \ldots, A_n)$ is not a minimum of $\ell$. That is, assume that there is
some $A' = (A'_1, \ldots, A'_n) \ne A$ such that $\varphi(w^{A'}) = f$ and $\ell(A'_1, \ldots, A'_n) <
\ell(A_1, \ldots, A_n)$. Consider the difference $h = w^A - w^{A'}$. We have $\varphi(h) =
f - f = 0$. In Exercise 5 below, you will show that this implies $h \in I$.
But then h must reduce to zero under the Gröbner basis $G$ for I. However,
because > is an adapted order, the leading term of h must be $w^A$, and
that monomial is reduced with respect to $G$ since it is a remainder. This
contradiction shows that A must give a minimum of $\ell$. □

Exercise 5. Let $f_i \in k[z_1, \ldots, z_m]$, $i = 1, \ldots, n$, as above and define a
mapping

\[ \varphi : k[w_1, \ldots, w_n] \to k[z_1, \ldots, z_m], \qquad w_i \mapsto f_i \]

as in (1.6). Let $I = \langle f_1 - w_1, \ldots, f_n - w_n \rangle \subset k[z_1, \ldots, z_m, w_1, \ldots, w_n]$.
Show that if $h \in k[w_1, \ldots, w_n]$ satisfies $\varphi(h) = 0$, then $h \in I \cap
k[w_1, \ldots, w_n]$. Hint: See the proof of Proposition 3 from Chapter 7, §4
of [CLO].

Exercise 6. Why did the lex order used to compute the Gröbner basis in
(1.9) correctly find the maximum value of $11A + 15B$ in our example
problem (1.1)? Explain, using Theorem (1.11). (Recall, $w_4$ and $w_3$ corresponding
to the slack variables were taken greater than $w_2, w_1$.)

Theorem (1.11) yields a Gröbner basis algorithm for solving integer
programming problems with all $a_{ij}, b_i \ge 0$:

Input: A, b from (1.4), an adapted monomial order >

Output: a solution of (1.4), if one exists

$f_j := \prod_{i=1}^{m} z_i^{a_{ij}}$

$I := \langle f_1 - w_1, \ldots, f_n - w_n \rangle$

$G :=$ Gröbner basis of I with respect to >

$f := \prod_{i=1}^{m} z_i^{b_i}$

$g := \overline{f}^{G}$

IF $g \in k[w_1, \ldots, w_n]$ THEN

    its exponent vector gives a solution
ELSE

    there is no solution

Monomial orders satisfying both the elimination and compatibility
properties from (1.10) can be specified in the following ways.

First, assume that all $c_j \ge 0$. Then it is possible to define a weight order
$>_\ell$ on the w-variables using the linear function $\ell$ (see [CLO], Chapter 2,
§4, Exercise 12). Namely, order monomials in the w-variables alone first by
$\ell$-values:

\[ w_1^{A_1} \cdots w_n^{A_n} >_\ell w_1^{A'_1} \cdots w_n^{A'_n} \]

if $\ell(A_1, \ldots, A_n) > \ell(A'_1, \ldots, A'_n)$, and break ties using any other fixed
monomial order on $k[w_1, \ldots, w_n]$. Then incorporate this order into a
product order on $k[z_1, \ldots, z_m, w_1, \ldots, w_n]$ with the z-variables greater than all
the w-variables, to ensure that the elimination property from (1.10) holds.

If some $c_j < 0$, then the recipe above produces a total ordering on
monomials in $k[z_1, \ldots, z_m, w_1, \ldots, w_n]$ that is compatible with
multiplication and that satisfies the elimination property. But it will not be a
well-ordering. So in order to apply the theory of Gröbner bases with
respect to monomial orders, we will need to be more clever in this case. We
begin with the following observation.

In $k[z_1, \ldots, z_m, w_1, \ldots, w_n]$, define a (non-standard) degree for each
variable by setting $\deg(z_i) = 1$ for all $i = 1, \ldots, m$, and $\deg(w_j) =
d_j = \sum_{i=1}^{m} a_{ij}$ for all $j = 1, \ldots, n$. Each $d_j$ must be strictly positive,
since otherwise the constraint equations would not depend on $A_j$. We say
a polynomial $f \in k[z_1, \ldots, z_m, w_1, \ldots, w_n]$ is homogeneous with respect
to these degrees if all the monomials $z^\alpha w^\beta$ appearing in f have the same
(non-standard) total degree $|\alpha| + \sum_j d_j \beta_j$.

(1.12) Lemma. With respect to the degrees $d_j$ on $w_j$, the following
statements hold.

a. The ideal $I = \langle f_1 - w_1, \ldots, f_n - w_n \rangle$ is homogeneous.

b. Every reduced Gröbner basis for the ideal I consists of homogeneous
polynomials.

Proof. Part a follows since the given generators are homogeneous for
these degrees: since $f_j = \prod_{i=1}^{m} z_i^{a_{ij}}$, the two terms in $f_j - w_j$ have the
same degree.

Part b follows in the same way as for ideals that are homogeneous in the
usual sense. The proof of Theorem 2 of Chapter 8, §3 of [CLO] goes over
to non-standard assignments of degrees as well. □

For instance, in the lex Gröbner basis given in (1.9) above, it is easy to
check that all the polynomials are homogeneous with respect to the degrees
$\deg(z_i) = 1$, $\deg(w_1) = 6$, $\deg(w_2) = 8$, and $\deg(w_3) = \deg(w_4) = 1$.
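
The homogeneity check for (1.9) can also be done mechanically. In the sketch below, the degrees $d_j$ are computed as column sums of the constraint matrix of the shipping problem (assumed here to have rows (4, 5, 1, 0) and (2, 3, 0, 1)), and the binomials of (1.9) are entered by hand as pairs of exponent vectors; all names are hypothetical.

```python
A_MATRIX = [[4, 5, 1, 0],
            [2, 3, 0, 1]]

def column_degrees(a):
    """d_j = sum over i of a_ij, the non-standard degree of w_j."""
    return [sum(col) for col in zip(*a)]

# Exponent vectors over (z1, z2, w1, w2, w3, w4); each pair is one binomial of (1.9).
BASIS = [
    ((1, 0, 0, 0, 0, 0), (0, 0, 0, 0, 1, 0)),  # z1 - w3
    ((0, 1, 0, 0, 0, 0), (0, 0, 0, 0, 0, 1)),  # z2 - w4
    ((0, 0, 0, 0, 4, 2), (0, 0, 1, 0, 0, 0)),  # w4^2 w3^4 - w1
    ((0, 0, 0, 1, 3, 1), (0, 0, 2, 0, 0, 0)),  # w4 w3^3 w2 - w1^2
    ((0, 0, 1, 0, 1, 1), (0, 0, 0, 1, 0, 0)),  # w4 w3 w1 - w2
    ((0, 0, 4, 0, 0, 1), (0, 0, 0, 3, 1, 0)),  # w4 w1^4 - w3 w2^3
    ((0, 0, 0, 2, 2, 0), (0, 0, 3, 0, 0, 0)),  # w3^2 w2^2 - w1^3
]

def weighted_degree(exponents, degrees):
    return sum(d * e for d, e in zip(degrees, exponents))

degrees = [1, 1] + column_degrees(A_MATRIX)  # deg(z_i) = 1, deg(w_j) = d_j
print(degrees)  # [1, 1, 6, 8, 1, 1]
is_homogeneous = all(
    weighted_degree(lead, degrees) == weighted_degree(tail, degrees)
    for lead, tail in BASIS)
print(is_homogeneous)  # True
```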

Since $d_j > 0$ for all j, given the $c_j$ from $\ell$ and $\mu > 0$ sufficiently large,
all the entries of the vector

\[ (c_1, \ldots, c_n) + \mu (d_1, \ldots, d_n) \]

will be positive. Let $\mu$ be any fixed number for which this is true. Consider
the (m + n)-component weight vectors $u_1, u_2$:

\[
\begin{aligned}
u_1 &= (1, \ldots, 1, 0, \ldots, 0) \\
u_2 &= (0, \ldots, 0, c_1, \ldots, c_n) + \mu (0, \ldots, 0, d_1, \ldots, d_n).
\end{aligned}
\]

Then all entries of $u_2$ are nonnegative, and hence we can define a weight
order $>_{u_1, u_2, \sigma}$ by comparing $u_1$-weights first, then comparing $u_2$-weights if
the $u_1$-weights are equal, and finally breaking ties with any other monomial
order $>_\sigma$.

Exercise 7. Consider an integer programming problem (1.4) in which
$a_{ij}, b_i \ge 0$ for all i, j.

a. Show that the order $>_{u_1, u_2, \sigma}$ defined above satisfies the elimination
condition from Definition (1.10).

b. Show that if $\varphi(w^A) = \varphi(w^{A'})$, then $w^A - w^{A'}$ is homogeneous with
respect to the degrees $d_j = \deg(w_j)$.

c. Deduce that $>_{u_1, u_2, \sigma}$ is an adapted order.

For example, our shipping problem (in standard form) can be solved
using the second method here. We take $u_1 = (1, 1, 0, 0, 0, 0)$, and letting
$\mu = 2$, we see that

\[ u_2 = (0, 0, -11, -15, 0, 0) + 2(0, 0, 6, 8, 1, 1) = (0, 0, 1, 1, 2, 2) \]

has all nonnegative entries. Finally, break ties with $>_\sigma$ = graded reverse
lex on all the variables ordered $z_1 > z_2 > w_1 > w_2 > w_3 > w_4$. Here is a
Singular session performing the Gröbner basis and remainder calculations.
Note the definition of the monomial order $>_{u_1, u_2, \sigma}$ by means of weight
vectors.

> ring R = 0,(z(1..2),w(1..4)),(a(1,1,0,0,0,0),
a(0,0,1,1,2,2),dp);
> ideal I = z(1)^4*z(2)^2-w(1), z(1)^5*z(2)^3-w(2), z(1)-w(3),
z(2)-w(4);
> ideal J = std(I);
> J;
J[1]=w(1)*w(3)*w(4)-1*w(2)
J[2]=w(2)^2*w(3)^2-1*w(1)^3
J[3]=w(1)^4*w(4)-1*w(2)^3*w(3)
J[4]=w(2)*w(3)^3*w(4)-1*w(1)^2
J[5]=w(3)^4*w(4)^2-1*w(1)
J[6]=z(2)-1*w(4)
J[7]=z(1)-1*w(3)
> poly f = z(1)^37*z(2)^20;
> reduce(f,J);
w(1)^4*w(2)^4*w(3)

We find

\[ \overline{f}^{G} = w_1^4 w_2^4 w_3 \]

as expected, giving the solution $A = 4$, $B = 4$, and $C = 1$, $D = 0$.
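
The weight vectors used in the session can be recomputed from the problem data. The sketch below is an illustration only: it takes the standard form of the shipping problem with c = (-11, -15, 0, 0) and constraint rows (4, 5, 1, 0) and (2, 3, 0, 1) as assumptions, and the function names are hypothetical.

```python
A_MATRIX = [[4, 5, 1, 0],
            [2, 3, 0, 1]]
COST = [-11, -15, 0, 0]  # minimizing -l is the same as maximizing l = 11A + 15B

def column_degrees(a):
    """d_j = sum over i of a_ij."""
    return [sum(col) for col in zip(*a)]

def weight_vectors(a, c, mu):
    """Return (u1, u2) on the variables (z_1..z_m, w_1..w_n) for the given mu."""
    m, n = len(a), len(a[0])
    d = column_degrees(a)
    shifted = [cj + mu * dj for cj, dj in zip(c, d)]
    if any(x <= 0 for x in shifted):
        raise ValueError("mu is too small: c + mu*d must have positive entries")
    return [1] * m + [0] * n, [0] * m + shifted

def u2_weight(point, u2, m):
    """u2-weight of the monomial w^point (no z-variables present)."""
    return sum(u * x for u, x in zip(u2[m:], point))

u1, u2 = weight_vectors(A_MATRIX, COST, mu=2)
print(u1)  # [1, 1, 0, 0, 0, 0]
print(u2)  # [0, 0, 1, 1, 2, 2]
print(u2_weight((4, 4, 1, 0), u2, 2))  # 10
print(u2_weight((5, 3, 2, 1), u2, 2))  # 14
```

For feasible points of this problem the $u_2$-weight works out to $114 - (11A + 15B)$, so smaller weight corresponds to larger revenue, which is the compatibility property required of an adapted order.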

Finally, we want to discuss general integer programming problems where
some of the $a_{ij}$ and $b_i$ may be negative. There is no real conceptual
difference in that case; the geometric interpretation of the integer programming
problem is exactly the same, only the positions of the affine linear spaces
bounding the feasible region change. But there is a difference in the
algebraic translation. Namely, we cannot view the negative $a_{ij}$ and $b_i$ directly
as exponents, since that is not legal in an ordinary polynomial. One way to fix
this problem is to consider what are called Laurent polynomials in the
variables $z_i$ instead: polynomial expressions in the $z_i$ and $z_i^{-1}$, as defined in
Chapter 7, §1 of this text. To deal with these more general objects without
introducing a whole new set of m variables, we will use the second
representation of the ring of Laurent polynomials, as presented in Exercise 15
of Chapter 7, §1:

\[ k[z_1^{\pm 1}, \ldots, z_m^{\pm 1}] \cong k[z_1, \ldots, z_m, t]/\langle t z_1 \cdots z_m - 1 \rangle. \]

In intuitive terms, this isomorphism works by introducing a single new
variable t satisfying $t z_1 \cdots z_m - 1 = 0$, so that formally t is the product of
the inverses of the $z_i$: $t = z_1^{-1} \cdots z_m^{-1}$. Then each of the $\prod_{i=1}^{m} z_i^{a_{ij}}$ involved
in the algebraic translation of the integer programming problem can be
rewritten in the form $t^{e_j} \prod_{i=1}^{m} z_i^{a'_{ij}}$, where now all $a'_{ij} \ge 0$: we can just
take $e_j \ge 0$ to be the negative of the smallest (most negative) $a_{ij}$ that
appears, and $a'_{ij} = e_j + a_{ij}$ for each i. Similarly, $\prod_{i=1}^{m} z_i^{b_i}$ can be rewritten
in the form $t^{e} \prod_{i=1}^{m} z_i^{b'_i}$ with $e \ge 0$, and $b'_i \ge 0$ for all i. It follows that
the equation (1.5) becomes an equation between polynomial expressions in
$t, z_1, \ldots, z_m$:

\[ \prod_{j=1}^{n} \Bigl( t^{e_j} \prod_{i=1}^{m} z_i^{a'_{ij}} \Bigr)^{A_j} = t^{e} \prod_{i=1}^{m} z_i^{b'_i}, \]

modulo the relation $t z_1 \cdots z_m - 1 = 0$. We have a direct analogue of
Proposition (1.6).

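(1.13) Proposition. Define a mapping

\[ \varphi : k[w_1, \ldots, w_n] \to k[z_1^{\pm 1}, \ldots, z_m^{\pm 1}] \]

by setting

\[ \varphi(w_j) = t^{e_j} \prod_{i=1}^{m} z_i^{a'_{ij}} \bmod \langle t z_1 \cdots z_m - 1 \rangle \]

for each $j = 1, \ldots, n$, and extending to general $g(w_1, \ldots, w_n) \in
k[w_1, \ldots, w_n]$ as before. Then $(A_1, \ldots, A_n)$ is an integer point in the
feasible region if and only if $\varphi(w_1^{A_1} w_2^{A_2} \cdots w_n^{A_n})$ and $t^{e} z_1^{b'_1} \cdots z_m^{b'_m}$ represent
the same element in $k[z_1^{\pm 1}, \ldots, z_m^{\pm 1}]$ (that is, their difference is divisible by
$t z_1 \cdots z_m - 1$).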

Similarly, Proposition (1.8) goes over to this more general situation.
We will write S for the image of $\varphi$ in $k[z_1^{\pm 1}, \ldots, z_m^{\pm 1}]$. Then we have the
following version of the subring membership test.

(1.14) Proposition. Suppose that $f_1, \ldots, f_n \in k[z_1, \ldots, z_m, t]$ are given.
Fix a monomial order in $k[z_1, \ldots, z_m, t, w_1, \ldots, w_n]$ with the elimination
property: any monomial containing one of the $z_i$ or t is greater than any
monomial containing only the $w_j$. Finally, let $G$ be a Gröbner basis for the
ideal

\[ J = \langle t z_1 \cdots z_m - 1,\; f_1 - w_1, \ldots, f_n - w_n \rangle \]

in $k[z_1, \ldots, z_m, t, w_1, \ldots, w_n]$ and for each $f \in k[z_1, \ldots, z_m, t]$, let $\overline{f}^{G}$ be
the remainder on division of f by $G$. Then

a. f represents an element in S if and only if $g = \overline{f}^{G} \in k[w_1, \ldots, w_n]$.

b. If f represents an element in S and $g = \overline{f}^{G} \in k[w_1, \ldots, w_n]$ as in part
a, then $f = g(f_1, \ldots, f_n)$, giving an expression for f as a polynomial
in the $f_j$.

c. If each $f_j$ and f are monomials and f represents an element in S, then
g is also a monomial.

The proof is essentially the same as the proof for Proposition (1.8) so we
omit it.

We should also mention that there is a direct parallel of Theorem (1.11)
saying that using monomial orders which have the elimination and
compatibility properties will yield minimum solutions for integer programming
problems and giving an algorithm for their solution. For $\ell$ with only
nonnegative coefficients, adapted orders may be constructed using product orders
as above, making t and the $z_i$ greater than any $w_j$. For a more general
discussion of constructing monomial orders compatible with a given $\ell$, we
refer the reader to [CT].

We will conclude this section with an example illustrating the general
case described in the previous paragraph. Consider the following problem
in standard form:

(1.15)
Minimize:

    $A + 1000B + C + 100D$,

Subject to the constraints:

\[
\begin{aligned}
3A - 2B + C &= -1 \\
4A + B - C - D &= 5 \\
A, B, C, D &\in \mathbb{Z}_{\ge 0}.
\end{aligned}
\]

With the relation $t z_1 z_2 - 1 = 0$, our ideal J in this case is

\[ J = \langle t z_1 z_2 - 1,\; z_1^3 z_2^4 - w_1,\; t^2 z_2^3 - w_2,\; t z_1^2 - w_3,\; t z_1 - w_4 \rangle. \]
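
The passage from the columns of the constraint matrix of (1.15) to the generators of J can be sketched as follows. The function name `clear_negative_exponents` is hypothetical; it returns the exponent e of t together with the shifted nonnegative exponents of the $z_i$.

```python
def clear_negative_exponents(column):
    """Rewrite prod z_i^{a_i} as t^e * prod z_i^{a_i + e} with all exponents >= 0.

    e is the negative of the most negative entry (or 0 if there is none).
    """
    e = max(0, -min(column))
    return e, [a + e for a in column]

# Constraint data of (1.15): rows are the two equations, columns are A, B, C, D.
A_MATRIX = [[3, -2, 1, 0],
            [4, 1, -1, -1]]
B_VECTOR = [-1, 5]

for j, column in enumerate(zip(*A_MATRIX), start=1):
    print(j, clear_negative_exponents(column))
# 1 (0, [3, 4])   -> z1^3 z2^4
# 2 (2, [0, 3])   -> t^2 z2^3
# 3 (1, [2, 0])   -> t z1^2
# 4 (1, [1, 0])   -> t z1
print(clear_negative_exponents(B_VECTOR))  # (1, [0, 6]) -> t z2^6
```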

If we use an elimination order placing $t, z_1, z_2$ before the w-variables, and
then use a weight order compatible with $\ell$ on the $w_j$ (breaking ties with
graded reverse lex), then we obtain a Gröbner basis $G$ for J consisting of
the following polynomials:

\[
\begin{aligned}
g_1 &= w_2 w_3^2 - w_4 \\
g_2 &= w_1 w_4^7 - w_3^3 \\
g_3 &= w_1 w_2 w_4^6 - w_3 \\
g_4 &= w_1 w_2^2 w_3 w_4^5 - 1 \\
g_5 &= z_2 - w_1 w_2^2 w_3 w_4^4 \\
g_6 &= z_1 - w_1 w_2 w_4^5 \\
g_7 &= t - w_2 w_3 w_4.
\end{aligned}
\]

From the right-hand sides of the equations, we consider $f = t z_2^6$. A
remainder computation yields

\[ \overline{f}^{G} = w_1 w_2^2 w_4. \]

Since this is still a very small problem, it is easy to check by hand that the
corresponding solution ($A = 1$, $B = 2$, $C = 0$, $D = 1$) really does minimize
$\ell(A, B, C, D) = A + 1000B + C + 100D$ subject to the constraints.

Exercise 8. Verify directly that the solution $(A, B, C, D) = (1, 2, 0, 1)$ of
the integer programming problem (1.15) is correct. Hint: Show first that
$B \ge 2$ in any solution of the constraint equations.

We should also remark that because of the special binomial form of the 
generators of the ideals in (1.11) and (1.13) and the simple polynomial 
remainder calculations involved here, there are a number of optimizations 
one could make in special-purpose Grobner basis linear programming soft- 
ware. See [CT] for more details and for some preliminary results on larger 
problems. 



Additional Exercises for §1 

Exercise 9. What happens if you apply the Gröbner basis algorithm to
any optimization problem on the polyhedral region in (1.3)?

Note: For the computational portions of the following problems, you will
need to have access to a Gröbner basis package that allows you to specify
mixed elimination-weight monomial orders as in the discussion
following Theorem (1.11). The built-in Gröbner basis routines in Mathematica
and Maple are not flexible enough for this. Singular and Macaulay, for
instance, are this flexible.







Exercise 10. Apply the methods of the text to solve the following integer
programming problems:

a.

Minimize: $2A + 3B + C + 5D$, subject to:
    $3A + 2B + C + D = 10$
    $4A + B + C = 5$
    $A, B, C, D \in \mathbb{Z}_{\ge 0}$.
Verify that your solution is correct.

b. Same as a, but with the right-hand sides of the constraint equations
changed to 20, 14 respectively. How much of the computation needs to
be redone?

c.

Maximize: $3A + 4B + 2C$, subject to:
    $3A + 2B + C \le 45$
    $A + 2B + 3C \le 21$
    $2A + B + C \le 18$
    $A, B, C \in \mathbb{Z}_{\ge 0}$.

Also, describe the feasible region for this problem geometrically, and use
that information to verify your solution.

Exercise 11. Consider the set P in $\mathbb{R}^3$ defined by inequalities:

\[
\begin{aligned}
 2A_1 + 2A_2 + 2A_3 &\le 5 \\
-2A_1 + 2A_2 + 2A_3 &\le 5 \\
 2A_1 + 2A_2 - 2A_3 &\le 5 \\
-2A_1 + 2A_2 - 2A_3 &\le 5 \\
 2A_1 - 2A_2 + 2A_3 &\le 5 \\
-2A_1 - 2A_2 + 2A_3 &\le 5 \\
 2A_1 - 2A_2 - 2A_3 &\le 5 \\
-2A_1 - 2A_2 - 2A_3 &\le 5.
\end{aligned}
\]

Verify that P is a solid (regular) octahedron. (What are the vertices?)

Exercise 12.

a. Suppose we want to consider all the integer points in a polyhedral region
$P \subset \mathbb{R}^n$ as feasible, not just those with non-negative coordinates. How
could the methods developed in the text be adapted to this more general
situation?

b. Apply your method from part a to find the minimum of $2A_1 - A_2 +
A_3$ on the integer points in the solid octahedron from Exercise 11.



§2 Integer Programming and Combinatorics 

In this section we will study a beautiful application of commutative algebra
and the ideas developed in §1 to combinatorial enumeration problems. For
those interested in exploring this rich subject farther, we recommend the
marvelous book [Sta1] by Stanley. Our main example is discussed there and
far-reaching generalizations are developed using more advanced algebraic
tools. There are also connections between the techniques we will develop
here, invariant theory (see especially [Stu1]), the theory of toric varieties
([Ful]), and the geometry of polyhedra (see [Stu2]). The prerequisites for
this section are the theory of Gröbner bases for polynomial ideals,
familiarity with quotient rings, and basic facts about Hilbert functions (see, e.g.
Chapter 6, §4 of this book or Chapter 9, §3 of [CLO]).

Most of this section will be devoted to the consideration of the following
classical counting problem. Recall that a magic square is an n x n integer
matrix $M = (m_{ij})$ with the property that the sum of the entries in each
row and each column is the same. A famous 4 x 4 magic square appears in
the well-known engraving Melancholia by Albrecht Dürer:

    16   3   2  13
     5  10  11   8
     9   6   7  12
     4  15  14   1

The row and column sums in this array all equal 34. Although the extra
condition that the $m_{ij}$ are the distinct integers $1, 2, \ldots, n^2$ (as in Dürer’s
magic square) is often included, we will not make that part of the definition.
Also, many familiar examples of magic squares have diagonal sums equal
to the row and column sum and other interesting properties; we will not
require that either. Our problem is this:

(2.1) Problem. Given positive integers s, n, how many different n x n
magic squares with $m_{ij} \ge 0$ for all i, j and row and column sum s are
there?

There are related questions from statistics and the design of experiments 
of practical as well as purely mathematical interest. In some small cases, 
the answer to (2.1) is easily derived. 

Exercise 1. Show that the number of 2 x 2 nonnegative integer magic 
squares with row and column sum s is precisely s + 1, for each s > 0. How 
are the squares with sum s > 1 related to those with s = 1? 

Exercise 2. Show that there are exactly six 3 x 3 magic squares with 
nonnegative integer entries and s = 1, twenty-one with s = 2, and fifty- 
five with s = 3. How many are there in each case if we require that the 
two diagonal sums also equal s? 
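
For small n and s the counts in Exercises 1 and 2 can be checked by direct enumeration. The following sketch fills in the square one row at a time while tracking how much of each column sum remains; the function names are hypothetical, and the approach is practical only for small cases.

```python
from functools import lru_cache

def compositions(total, parts, caps):
    """Yield tuples of `parts` nonnegative integers with sum `total`,
    where entry i is at most caps[i]."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, parts - 1, caps[1:]):
            yield (first,) + rest

def count_magic_squares(n, s):
    """Number of n x n nonnegative integer matrices with all row and column sums s."""
    @lru_cache(maxsize=None)
    def fill(rows_left, remaining):
        # `remaining` holds what is still needed in each column.
        if rows_left == 0:
            return 1 if all(r == 0 for r in remaining) else 0
        return sum(
            fill(rows_left - 1, tuple(r - x for r, x in zip(remaining, row)))
            for row in compositions(s, n, remaining))
    return fill(n, (s,) * n)

print([count_magic_squares(2, s) for s in range(5)])    # [1, 2, 3, 4, 5]
print([count_magic_squares(3, s) for s in (1, 2, 3)])   # [6, 21, 55]
```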

Our main goal in this section will be to develop a general way to attack
this and similar counting problems where the objects to be counted can be
identified with the integer points in a polyhedral region in $\mathbb{R}^N$ for some N,
so that we are in the same setting as in the integer programming problems
from §1. We will take a somewhat ad hoc approach though, and use only as
much general machinery as we need to answer our question (2.1) for small
values of n.

To see how (2.1) fits into this context, note that the entire set of n x n
nonnegative integer magic squares M is the set of solutions in $\mathbb{Z}^{n \times n}_{\ge 0}$ of
a system of linear equations with integer coefficients. For instance, in the
3 x 3 case, the conditions that all row and column sums are equal can be
expressed as 5 linear equations on the entries of the matrix. Writing

\[ \vec{m} = (m_{11}, m_{12}, m_{13}, m_{21}, m_{22}, m_{23}, m_{31}, m_{32}, m_{33})^T, \]

the matrix $M = (m_{ij})$ is a magic square if and only if

(2.2)
\[ A_3 \vec{m} = 0, \]

where $A_3$ is the 5 x 9 integer matrix

(2.3)
\[
A_3 = \begin{pmatrix}
1 & 1 & 1 & -1 & -1 & -1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0 & -1 & -1 & -1 \\
0 & 1 & 1 & -1 & 0 & 0 & -1 & 0 & 0 \\
1 & -1 & 0 & 1 & -1 & 0 & 1 & -1 & 0 \\
1 & 0 & -1 & 1 & 0 & -1 & 1 & 0 & -1
\end{pmatrix}
\]

and $m_{ij} \ge 0$ for all i, j. Similarly, the n x n magic squares can be viewed
as the solutions of a similar system $A_n \vec{m} = 0$ for an integer matrix $A_n$
with $n^2$ columns.

Exercise 3.

a. Show that the 3 x 3 nonnegative integer magic squares are exactly the
solutions of the system of linear equations (2.2) with matrix $A_3$ given
in (2.3).

b. What is the minimal number of linear equations needed to define the
corresponding space of n x n magic squares? Describe an explicit way
to produce a matrix $A_n$ as above.
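
Before proving part a of Exercise 3, it can be reassuring to test the claim numerically. The sketch below compares the condition $A_3 \vec{m} = 0$ with a direct check of row and column sums on all 3 x 3 squares with small entries. The matrix is typed in from (2.3); the function names are hypothetical, and a finite check of this kind is of course not a proof.

```python
from itertools import product

# The 5 x 9 matrix of (2.3); columns follow m11, m12, m13, m21, ..., m33.
A3 = [
    [1, 1, 1, -1, -1, -1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, -1, -1, -1],
    [0, 1, 1, -1, 0, 0, -1, 0, 0],
    [1, -1, 0, 1, -1, 0, 1, -1, 0],
    [1, 0, -1, 1, 0, -1, 1, 0, -1],
]

def in_kernel(matrix, vec):
    """True when matrix * vec = 0."""
    return all(sum(a * x for a, x in zip(row, vec)) == 0 for row in matrix)

def is_magic(vec):
    """True when the 3 x 3 square stored row by row in vec has equal line sums."""
    rows = [vec[0:3], vec[3:6], vec[6:9]]
    sums = [sum(r) for r in rows] + [sum(c) for c in zip(*rows)]
    return len(set(sums)) == 1

def conditions_agree(max_entry):
    """Compare both tests on every square with entries in 0..max_entry."""
    return all(in_kernel(A3, m) == is_magic(m)
               for m in product(range(max_entry + 1), repeat=9))

print(conditions_agree(1))                             # True
print(in_kernel(A3, (1, 0, 0, 0, 1, 0, 0, 0, 1)))      # True (identity matrix)
print(in_kernel(A3, (1, 1, 0, 0, 0, 0, 0, 0, 0)))      # False
```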

There are three important differences between our situation here and the
optimization problems considered in §1. First, there is no linear function
to be optimized. Instead, we are mainly interested in understanding the
structure of the entire set of integer points in the feasible region. Second,
unlike the regions considered in the examples in §1, the feasible region in
this case is unbounded, and there are infinitely many integer points. Finally,
we have a homogeneous system of equations rather than an inhomogeneous
system, so the points of interest are elements of the kernel of the matrix
$A_n$. In the following, we will write

\[ K_n = \ker(A_n) \cap \mathbb{Z}^{n \times n}_{\ge 0} \]

for the set of all nonnegative integer n x n magic squares. We begin with
a few simple observations.

(2.4) Proposition. For each n,

a. $K_n$ is closed under vector sums, and contains the zero vector.

b. The set $C_n$ of solutions of $A_n \vec{m} = 0$ satisfying $\vec{m} \in \mathbb{R}^{n \times n}_{\ge 0}$ forms a
convex polyhedral cone in $\mathbb{R}^{n \times n}$, with vertex at the origin.

Proof. Statement a follows by linearity. For b, $C_n$ is polyhedral since
the defining equations are the linear equations $A_n \vec{m} = 0$ and the linear
inequalities $m_{ij} \ge 0$. It is a cone since any positive real multiple of
a point in $C_n$ is also in $C_n$. Finally, it is convex since if $\vec{m}$ and $\vec{m}'$ are two
points in $C_n$, any linear combination $x = r\vec{m} + (1 - r)\vec{m}'$ with $r \in [0, 1]$
also satisfies the equations $A_n x = 0$ and has nonnegative entries, hence lies
in $C_n$. □

A set M with a binary operation is said to be a monoid if the operation
is associative and possesses an identity element in M. For example $\mathbb{Z}^{n \times n}_{\ge 0}$ is
a monoid under vector addition. In this language, part a of the proposition
says that $K_n$ is a submonoid of $\mathbb{Z}^{n \times n}_{\ge 0}$.

To understand the structure of the submonoid $K_n$, we will seek to find
a minimal set of additive generators to serve as building blocks for all the
elements of $K_n$. The appropriate notion is given by the following definition.

(2.5) Definition. Let K be any submonoid of the additive monoid $\mathbb{Z}^{N}_{\ge 0}$.
A finite subset $\mathcal{H} \subset K$ is said to be a Hilbert basis for K if it satisfies the
following two conditions.

a. For every $k \in K$ there exist $h_i \in \mathcal{H}$ and nonnegative integers $c_i$ such
that $k = \sum_i c_i h_i$, and

b. $\mathcal{H}$ is minimal with respect to inclusion.
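
For a small monoid, Definition (2.5) can be explored by brute force: a nonzero element must belong to every generating set if it cannot be written as a sum of two nonzero elements of K. The sketch below searches for such indecomposable elements among the 3 x 3 nonnegative integer magic squares with line sum at most 3. The cutoff and the function names are choices made for this illustration, so the search by itself only shows that no further generators appear up to that cutoff.

```python
from itertools import permutations, product

def magic_squares(s):
    """All 3 x 3 nonnegative integer magic squares with line sum s (row-major 9-tuples)."""
    squares = []
    for a, b, d, e in product(range(s + 1), repeat=4):
        c, f = s - a - b, s - d - e        # complete the first two rows
        g, h = s - a - d, s - b - e        # complete the first two columns
        i = s - g - h                      # complete the last row
        m = (a, b, c, d, e, f, g, h, i)
        if min(m) >= 0:
            squares.append(m)
    return squares

def indecomposables(max_sum):
    """Nonzero squares with line sum <= max_sum that are not a sum of two nonzero squares."""
    by_sum = {s: magic_squares(s) for s in range(1, max_sum + 1)}
    members = {m for squares in by_sum.values() for m in squares}
    basis = []
    for s in range(1, max_sum + 1):
        for m in by_sum[s]:
            splits = any(
                tuple(x - y for x, y in zip(m, h)) in members
                for t in range(1, s) for h in by_sum[t])
            if not splits:
                basis.append(m)
    return basis

basis = indecomposables(3)
print(len(basis))  # 6

# The six 3 x 3 permutation matrices, written row by row.
perm_matrices = sorted(
    tuple(1 if p[r] == c else 0 for r in range(3) for c in range(3))
    for p in permutations(range(3)))
print(sorted(basis) == perm_matrices)  # True
```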

It is a general fact that Hilbert bases exist and are unique for all
submonoids $K \subset \mathbb{Z}^{N}_{\ge 0}$. Instead of giving an existence proof, however, we will
present a Gröbner basis algorithm for finding the Hilbert basis for the
submonoid $K = \ker(A)$ in $\mathbb{Z}^{N}_{\ge 0}$ for any integer matrix A with N columns. (This
comes from [Stu1], §1.4.) As in §1, we translate our problem from the
context of integer points to Laurent polynomials. Given an integer matrix
$A = (a_{ij})$ with N columns and m rows say, we introduce an indeterminate
$z_i$ for each row, $i = 1, \ldots, m$, and consider the ring of Laurent polynomials:

\[ k[z_1^{\pm 1}, \ldots, z_m^{\pm 1}] \cong k[z_1, \ldots, z_m, t]/\langle t z_1 \cdots z_m - 1 \rangle. \]

(See §1 of this chapter and Exercise 15 of Chapter 7, §1.) Define a mapping

(2.6)
\[ \psi : k[v_1, \ldots, v_N, w_1, \ldots, w_N] \to k[z_1^{\pm 1}, \ldots, z_m^{\pm 1}][w_1, \ldots, w_N] \]

as follows. First take

(2.7)
\[ \psi(v_j) = w_j \cdot \prod_{i=1}^{m} z_i^{a_{ij}} \]

and $\psi(w_j) = w_j$ for each $j = 1, \ldots, N$, then extend to polynomials in
$k[v_1, \ldots, v_N, w_1, \ldots, w_N]$ so as to make a ring homomorphism.

The purpose of $\psi$ is to detect elements of the kernel of A.

(2.8) Proposition. A vector $\alpha^T \in \ker(A)$ if and only if $\psi(v^\alpha - w^\alpha) = 0$,
that is if and only if $v^\alpha - w^\alpha$ is in the kernel of the homomorphism $\psi$.



Exercise 4. Prove Proposition (2.8). 

As in Exercise 5 of §1, we can write $J = \ker(\psi)$ as

\[ J = I \cap k[v_1, \ldots, v_N, w_1, \ldots, w_N], \]

where

\[ I = \Bigl\langle w_j \cdot \prod_{i=1}^{m} z_i^{a_{ij}} - v_j : j = 1, \ldots, N \Bigr\rangle \]

in the ring $k[z_1^{\pm 1}, \ldots, z_m^{\pm 1}][v_1, \ldots, v_N, w_1, \ldots, w_N]$. The following theorem
of Sturmfels (Algorithm 1.4.5 of [Stu1]) gives a way to find Hilbert bases.

(2.9) Theorem. Let $G$ be a Gröbner basis for I with respect to any
elimination order > for which all $z_i, t > v_j$, and all $v_j > w_k$. Let S be the
subset of $G$ consisting of elements of the form $v^\alpha - w^\alpha$ for some $\alpha \in \mathbb{Z}^{N}_{\ge 0}$.
Then

\[ \mathcal{H} = \{ \alpha : v^\alpha - w^\alpha \in S \} \]

is the Hilbert basis for K.

Proof. The idea of this proof is similar to that of Theorem (1.11) of this 
chapter. See [Stul] for a complete exposition. □ 



Here is a first example to illustrate Theorem (2.9). Consider the
submonoid of $\mathbb{Z}^4_{\geq 0}$ given as $K = \ker(A) \cap \mathbb{Z}^4_{\geq 0}$, for

    $A = \begin{pmatrix} 1 & 2 & -1 & 0 \\ 1 & 1 & -1 & -2 \end{pmatrix}$.

To find a Hilbert basis for $K$, we consider the ideal $I$ generated by

    $w_1 z_1 z_2 - v_1$,  $w_2 z_1^2 z_2 - v_2$,  $w_3 t - v_3$,  $w_4 z_1^2 t^2 - v_4$

and $z_1 z_2 t - 1$. Computing a Gröbner basis $\mathcal{G}$ with respect to an elimination
order as in (2.9), we find that only one element is of the desired form:

    $v_1 v_3 - w_1 w_3$.

It follows that the Hilbert basis for $K$ consists of a single element:
$\mathcal{H} = \{(1, 0, 1, 0)\}$. It is not difficult to verify from the form of the matrix $A$ that
every element in $K$ is a nonnegative integer multiple of this vector. Note that the
size of the Hilbert basis is not the same as the dimension of the kernel of
the matrix $A$ as a linear mapping on $\mathbb{R}^4$. In general, there is no connection
between the size of the Hilbert basis for $K = \ker(A) \cap \mathbb{Z}^N_{\geq 0}$ and $\dim \ker(A)$;
the number of elements in the Hilbert basis can be either larger than, equal
to, or smaller than the dimension of the kernel, depending on $A$.
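For a small matrix the Hilbert basis can also be found by direct search, which gives an independent check on the Gröbner basis computation. The sketch below is not the method of Theorem (2.9). It relies on the fact that for a monoid of the form $\ker(A) \cap \mathbb{Z}^N_{\geq 0}$ the Hilbert basis consists of the nonzero elements that are minimal for the componentwise order. It takes $A$ to be the matrix with rows $(1, 2, -1, 0)$ and $(1, 1, -1, -2)$, and it only searches a finite box, so Hilbert basis elements with an entry above the bound would be missed. The function names and the bound are assumptions made for illustration.

```python
from itertools import product


def kernel_points(A, bound):
    """Nonzero vectors in ker(A) with integer entries between 0 and bound."""
    n = len(A[0])
    for alpha in product(range(bound + 1), repeat=n):
        if any(alpha) and all(
            sum(row[j] * alpha[j] for j in range(n)) == 0 for row in A
        ):
            yield alpha


def minimal_elements(points):
    """Elements that are minimal for the componentwise partial order."""
    pts = list(points)

    def leq(u, v):
        return all(a <= b for a, b in zip(u, v))

    return [v for v in pts if not any(u != v and leq(u, v) for u in pts)]


A = [[1, 2, -1, 0],
     [1, 1, -1, -2]]

hilbert_candidates = minimal_elements(kernel_points(A, bound=4))
print(hilbert_candidates)
```

Subtracting the two rows of $A$ shows that the second and fourth coordinates of any element of $K$ must vanish, which is why the search returns a single vector.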

We will now use Theorem (2.9) to continue our work on the magic square
enumeration problem. If we apply the method of the theorem to find the
Hilbert basis for $\ker(A_3) \cap \mathbb{Z}^9_{\geq 0}$ (see equation (2.3) above), then we need
to compute a Gröbner basis for the ideal $I$ generated by

    $v_1 - w_1 z_1 z_2 z_4 z_5$              $v_2 - w_2 z_1^2 z_2^2 z_3^2 z_5 t$
    $v_3 - w_3 z_1^2 z_2^2 z_3^2 z_4 t$      $v_4 - w_4 z_2 z_4^2 z_5^2 t$
    $v_5 - w_5 z_2 z_3 z_5 t$                $v_6 - w_6 z_2 z_3 z_4 t$
    $v_7 - w_7 z_1 z_4^2 z_5^2 t$            $v_8 - w_8 z_1 z_3 z_5 t$
    $v_9 - w_9 z_1 z_3 z_4 t$

and $z_1 \cdots z_5 t - 1$ in the ring

    $k[z_1, \ldots, z_5, t, v_1, \ldots, v_9, w_1, \ldots, w_9]$.

Using an elimination order as described in Theorem (2.9) with the computer
algebra system Macaulay, one obtains a very large Gröbner basis. (Because
of the simple binomial form of the generators, however, the computation
goes extremely quickly.) However, if we identify the subset $S$ as in the
theorem, there are only six polynomials corresponding to the Hilbert basis
elements:



(2.10)
    $v_3 v_5 v_7 - w_3 w_5 w_7$      $v_3 v_4 v_8 - w_3 w_4 w_8$
    $v_2 v_6 v_7 - w_2 w_6 w_7$      $v_2 v_4 v_9 - w_2 w_4 w_9$
    $v_1 v_6 v_8 - w_1 w_6 w_8$      $v_1 v_5 v_9 - w_1 w_5 w_9$.



Expressing the corresponding 6-element Hilbert basis in matrix form, we
see something quite interesting. The matrices we obtain are precisely the six
$3 \times 3$ permutation matrices, that is, the matrix representations of the permutations
of the components of vectors in $\mathbb{R}^3$. (This should also agree with your results
in the first part of Exercise 2.) For instance, the Hilbert basis element
$(0, 0, 1, 0, 1, 0, 1, 0, 0)$ from the first polynomial in (2.10) corresponds to
the matrix

    $T_{13} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}$,

which interchanges $x_1, x_3$, leaving $x_2$ fixed. Similarly, the other elements of
the Gröbner basis give (in the order listed above)

    $S = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$,    $S^2 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$,

    $T_{12} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$,    $T_{23} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$,

    $I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.

Here $S$ and $S^2$ are the cyclic permutations, $T_{ij}$ interchanges $x_i$ and $x_j$, and
$I$ is the identity.

Indeed, it is a well-known combinatorial theorem that the $n \times n$ permutation
matrices form the Hilbert basis for the monoid $K_n$ for all $n \geq 2$. See
Exercise 9 below for a general proof.

This gives us some extremely valuable information to work with. By the
definition of a Hilbert basis we have, for instance, that in the $3 \times 3$ case
every element $M$ of $K_3$ can be written as a linear combination

    $M = aI + bS + cS^2 + dT_{12} + eT_{13} + fT_{23}$,

where $a, b, c, d, e, f$ are nonnegative integers. This is what we meant before
by saying that we were looking for “building blocks” for the elements of
our additive monoid of magic squares. The row and column sum of $M$ is
then given by

    $s = a + b + c + d + e + f$.

It might appear at first glance that our problem is solved for $3 \times 3$
matrices. Namely, for a given sum value $s$, it might seem that we just
need to count the ways to write $s$ as a sum of at most 6 nonnegative
integers $a, b, c, d, e, f$. However, there is an added wrinkle here that makes
the problem even more interesting: The 6 permutation matrices are not
linearly independent. In fact, there is an obvious relation

(2.11)    $I + S + S^2 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} = T_{12} + T_{13} + T_{23}$.

This means that for all $s \geq 3$ there are different combinations of coefficients
that produce the same matrix sum. How can we take this (and other
possible relations) into account and eliminate multiple counting?
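Relation (2.11) is easy to confirm by machine. The sketch below generates the six $3 \times 3$ permutation matrices, separates them by the sign of the permutation (the even ones are $I, S, S^2$ and the odd ones are the three transpositions), and sums each group. The helper names are our own and are not part of the text.

```python
from itertools import permutations


def perm_matrix(p):
    """Permutation matrix with a 1 in row i, column p[i]."""
    n = len(p)
    return tuple(tuple(1 if p[i] == j else 0 for j in range(n)) for i in range(n))


def sign(p):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


def mat_sum(mats):
    """Entrywise sum of a list of square matrices of the same size."""
    n = len(mats[0])
    return tuple(tuple(sum(m[i][j] for m in mats) for j in range(n)) for i in range(n))


perms = list(permutations(range(3)))
even = [perm_matrix(p) for p in perms if sign(p) == 1]    # I, S, S^2
odd = [perm_matrix(p) for p in perms if sign(p) == -1]    # T12, T13, T23
ones = tuple((1, 1, 1) for _ in range(3))

print(mat_sum(even) == ones, mat_sum(odd) == ones)
```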

First, we claim that in fact every equality

(2.12)    $aI + bS + cS^2 + dT_{12} + eT_{13} + fT_{23}$
              $= a'I + b'S + c'S^2 + d'T_{12} + e'T_{13} + f'T_{23}$,

where $a, \ldots, f, a', \ldots, f'$ are nonnegative integers, is a consequence of the
relation in (2.11), in the sense that if (2.12) is true, then the difference
vector

    $(a, b, c, d, e, f) - (a', b', c', d', e', f')$

is an integer multiple of the vector of coefficients $(1, 1, 1, -1, -1, -1)$ in the
linear dependence relation

    $I + S + S^2 - T_{12} - T_{13} - T_{23} = 0$,

which follows from (2.11).

This can be verified directly as follows. 

Exercise 5. 

a. Show that the six $3 \times 3$ permutation matrices span a 5-dimensional
subspace of the vector space of $3 \times 3$ real matrices over $\mathbb{R}$.

b. Using part a, show that in every relation (2.12) with $a, \ldots, f' \in \mathbb{Z}_{\geq 0}$,
$(a, b, c, d, e, f) - (a', b', c', d', e', f')$ is an integer multiple of the vector
$(1, 1, 1, -1, -1, -1)$.

Given this, we can solve our problem in the $3 \times 3$ case by “retranslating”
it into algebra. Namely, we can identify the 6-tuples of coefficients
$(a, b, c, d, e, f) \in \mathbb{Z}^6_{\geq 0}$ with monomials in 6 new indeterminates denoted
$x_1, \ldots, x_6$:

    $\alpha = (a, b, c, d, e, f) \leftrightarrow x_1^a x_2^b x_3^c x_4^d x_5^e x_6^f$.

By (2.11), though, we see that we want to think of $x_1 x_2 x_3$ and $x_4 x_5 x_6$ as
being the same. This observation indicates that, in counting, we want to
consider the element of the quotient ring

    $R = k[x_1, \ldots, x_6]/\langle x_1 x_2 x_3 - x_4 x_5 x_6 \rangle$

represented by the monomial $x^\alpha$. Let $MS_3(s)$ be the number of distinct
$3 \times 3$ integer magic squares with nonnegative entries, and row and column
sum equal to $s$. Our next goal is to show that $MS_3(s)$ can be reinterpreted
as the Hilbert function of the above ring $R$.

We recall from §4 of Chapter 6 that a homogeneous ideal
$I \subset k[x_1, \ldots, x_n]$ gives a quotient ring $R = k[x_1, \ldots, x_n]/I$, and the Hilbert
function $H_R(s)$ is defined by

(2.13)    $H_R(s) = \dim_k k[x_1, \ldots, x_n]_s / I_s = \dim_k k[x_1, \ldots, x_n]_s - \dim_k I_s$,

where $k[x_1, \ldots, x_n]_s$ is the vector space of homogeneous polynomials of
total degree $s$, and $I_s$ is the vector space of homogeneous polynomials
of total degree $s$ in $I$. In the notation of Chapter 9, §3 of [CLO], the Hilbert
function of $R = k[x_1, \ldots, x_n]/I$ is written $HF_I(s)$. Since our focus here is
on the ideal $I$, in what follows, we will call both $H_R(s)$ and $HF_I(s)$ the
Hilbert function of $I$. It is a basic result that the Hilbert functions of $I$
and $\langle \mathrm{LT}(I) \rangle$ (for any monomial order) are equal. Hence we can compute
the Hilbert function by counting the number of standard monomials with
respect to $I$ for each total degree $s$, that is, monomials of total degree $s$ in
the complement of $\langle \mathrm{LT}(I) \rangle$. For this and other information about Hilbert
functions, the reader should consult [CLO], Chapter 9, §3 or Chapter 6, §4
of this book.

(2.14) Proposition. The function $MS_3(s)$ equals the Hilbert function
$H_R(s) = HF_I(s)$ of the homogeneous ideal $I = \langle x_1 x_2 x_3 - x_4 x_5 x_6 \rangle$.

Proof. The single element set $\{x_1 x_2 x_3 - x_4 x_5 x_6\}$ is a Gröbner basis for
the ideal it generates with respect to any monomial order. Fix any order
such that the leading term of the generator is $x_1 x_2 x_3$. Then the standard
monomials of total degree $s$ in $k[x_1, \ldots, x_6]$ are the monomials of total
degree $s$ that are not divisible by $x_1 x_2 x_3$.

Given any monomial $x^\alpha = x_1^a x_2^b x_3^c x_4^d x_5^e x_6^f$, let $A = \min(a, b, c)$, and
construct

    $\alpha' = (a - A, b - A, c - A, d + A, e + A, f + A)$.

Since $x^{\alpha'}$ is not divisible by $x_1 x_2 x_3$, it is a standard monomial, and you
will show in Exercise 6 below that it is the remainder on division of $x^\alpha$ by
$x_1 x_2 x_3 - x_4 x_5 x_6$.

We need to show that the $3 \times 3$ magic squares with row and column sum
$s$ are in one-to-one correspondence with the standard monomials of degree
$s$. Let $M$ be a magic square, and consider any expression

(2.15)    $M = aI + bS + cS^2 + dT_{12} + eT_{13} + fT_{23}$

with $\alpha = (a, \ldots, f) \in \mathbb{Z}^6_{\geq 0}$. We associate to $M$ the standard form in $R$
of the monomial $x^\alpha$, namely $x^{\alpha'}$ as above. In Exercise 7 you will show
that this gives a well-defined mapping from the set of magic squares to
the collection of standard monomials with respect to $I$, since by Exercise
5 any two expressions (2.15) for $M$ yield the same standard monomial $x^{\alpha'}$.
Moreover the row and column sum of $M$ is the same as the total degree of
the image monomial.

This mapping is clearly onto, since the exponent vector $\alpha'$ of any standard
monomial can be used to give the coefficients in an expression (2.15). It is
also one-to-one, since if $M$ in (2.15) and

    $M_1 = a_1 I + b_1 S + c_1 S^2 + d_1 T_{12} + e_1 T_{13} + f_1 T_{23}$

map to the same standard monomial $x^{\alpha'}$, then writing $A = \min(a, b, c)$,
$A_1 = \min(a_1, b_1, c_1)$, we have

    $(a - A, b - A, c - A, d + A, e + A, f + A)$
        $= (a_1 - A_1, b_1 - A_1, c_1 - A_1, d_1 + A_1, e_1 + A_1, f_1 + A_1)$.

It follows that $(a, \ldots, f)$ and $(a_1, \ldots, f_1)$ differ by the vector

    $(A - A_1)(1, 1, 1, -1, -1, -1)$.

Hence by Exercise 5 again, the magic squares $M$ and $M_1$ are equal. □
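The correspondence used in this proof can be tested by brute force for small $s$. The sketch below is an illustration under our own naming, not part of the proof: it lists all coefficient vectors $\alpha$ of total degree $s$, forms the matrix $M$ of (2.15) and the exponent vector $\alpha'$ of the standard form, checks that $\alpha'$ depends only on $M$, and compares the number of distinct squares with the number of distinct standard monomials.

```python
from itertools import product

# The six permutation matrices, flattened row by row,
# in the order I, S, S^2, T12, T13, T23 used in (2.15).
BASIS = [
    (1, 0, 0, 0, 1, 0, 0, 0, 1),  # I
    (0, 0, 1, 1, 0, 0, 0, 1, 0),  # S
    (0, 1, 0, 0, 0, 1, 1, 0, 0),  # S^2
    (0, 1, 0, 1, 0, 0, 0, 0, 1),  # T12
    (0, 0, 1, 0, 1, 0, 1, 0, 0),  # T13
    (1, 0, 0, 0, 0, 1, 0, 1, 0),  # T23
]


def standard_form(alpha):
    """Exponent vector alpha' of the standard form of x^alpha."""
    a, b, c, d, e, f = alpha
    A = min(a, b, c)
    return (a - A, b - A, c - A, d + A, e + A, f + A)


def magic_square(alpha):
    """Flattened 3 x 3 matrix aI + bS + cS^2 + dT12 + eT13 + fT23."""
    return tuple(sum(coef * P[i] for coef, P in zip(alpha, BASIS)) for i in range(9))


def compare_counts(s):
    """Return (number of magic squares, number of standard monomials) for sum s."""
    squares = {}
    for alpha in product(range(s + 1), repeat=6):
        if sum(alpha) != s:
            continue
        M = magic_square(alpha)
        image = standard_form(alpha)
        # Well-definedness: every expression for M must give the same alpha'.
        assert squares.setdefault(M, image) == image
    return len(squares), len(set(squares.values()))


print(compare_counts(3))
```

Equality of the two counts, together with the well-definedness check, is exactly the statement that the map from magic squares to standard monomials is a bijection in that degree.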



For readers of Chapter 7, we would like to mention that there is also
a much more conceptual way to understand the relationship between the
monoid $K_3$ from our original problem and the ring $R$ and the corresponding
variety $V(x_1 x_2 x_3 - x_4 x_5 x_6)$, using the theory of toric varieties. In particular,
if $\mathcal{A} = \{m_1, \ldots, m_6\} \subset \mathbb{Z}^9$ is the set of integer vectors corresponding
to the $3 \times 3$ permutation matrices as above (the Hilbert basis for $K_3$), and
we define $\phi_{\mathcal{A}} : (\mathbb{C}^*)^9 \to \mathbb{P}^5$ by

    $\phi_{\mathcal{A}}(t) = (t^{m_1}, \ldots, t^{m_6})$

as in §3 of Chapter 7, then it follows that the toric variety $X_{\mathcal{A}}$ (the Zariski
closure of the image of $\phi_{\mathcal{A}}$) is the projective variety $V(x_1 x_2 x_3 - x_4 x_5 x_6)$.
The ideal $I_{\mathcal{A}} = \langle x_1 x_2 x_3 - x_4 x_5 x_6 \rangle$ is called the toric ideal corresponding
to $\mathcal{A}$. The defining homogeneous ideal of a toric variety is always generated
by differences of monomials, as in this example. See the book [Stu2] for
more details.

To conclude, Proposition (2.14) solves the $3 \times 3$ magic square counting
problem as follows. By the proposition and (2.13), to find $MS_3(s)$, we
simply subtract the number of nonstandard monomials of total degree $s$
in 6 variables from the total number of monomials of total degree $s$ in
6 variables. The nonstandard monomials are those divisible by $x_1 x_2 x_3$;
removing that factor, we obtain an arbitrary monomial of total degree
$s - 3$. Hence one expression is the following:

(2.16)    $MS_3(s) = \binom{s+5}{5} - \binom{(s-3)+5}{5} = \binom{s+5}{5} - \binom{s+2}{5}$.

(Also see Exercise 8 below.) For example, $MS_3(1) = 6$ (binomial coefficients
$\binom{m}{\ell}$ with $m < \ell$ are zero), $MS_3(2) = 21$, and $MS_3(3) = 56 - 1 = 55$.
The value $s = 3$ is the first one at which the relation (2.11) comes into play.
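Formula (2.16) can be compared against a direct enumeration for small $s$. The sketch below is a check of this kind, with function names chosen for illustration. The enumeration chooses the four entries in the upper left $2 \times 2$ block, fills in the rest of the first two rows from the row sums and the third row from the column sums, and keeps the square when all entries are nonnegative.

```python
from math import comb


def count_magic_squares(s):
    """Count 3 x 3 nonnegative integer matrices with all row and column sums s."""
    count = 0
    for a in range(s + 1):
        for b in range(s + 1 - a):
            c = s - a - b                      # first row is (a, b, c)
            for d in range(s + 1 - a):
                for e in range(s + 1 - d):
                    f = s - d - e              # second row is (d, e, f)
                    # The column sums force the third row (g, h, i);
                    # g = s - a - d is nonnegative by the loop bounds.
                    h = s - b - e
                    i = s - c - f
                    if h >= 0 and i >= 0:
                        count += 1
    return count


def ms3_formula(s):
    """Right-hand side of (2.16); comb(n, k) is 0 when k > n."""
    return comb(s + 5, 5) - comb(s + 2, 5)


for s in range(5):
    print(s, count_magic_squares(s), ms3_formula(s))
```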

For readers who have studied Chapter 6 of this book, we should also
mention how free resolutions can be used to obtain (2.16). The key point
is that the ideal $I = \langle x_1 x_2 x_3 - x_4 x_5 x_6 \rangle$ is generated by a polynomial of
degree 3, so that $I \cong k[x_1, \ldots, x_6](-3)$ as $k[x_1, \ldots, x_6]$-modules. Hence
$R = k[x_1, \ldots, x_6]/I$ gives the exact sequence

    $0 \to k[x_1, \ldots, x_6](-3) \to k[x_1, \ldots, x_6] \to R \to 0$.

Since $H_R(s) = HF_I(s) = MS_3(s)$ by Proposition (2.14), the formula (2.16)
follows immediately by the methods of Chapter 6, §4.

These techniques and more sophisticated ideas from commutative algebra,
including the theory of toric varieties, have also been applied to the
$n \times n$ magic square problem and other related questions from statistics
and the design of experiments. We refer the reader once more to [Sta1] and
[Stu2] for a more complete discussion; we hope this brief introduction has
served to whet his or her appetite to explore the extremely fruitful collaboration
between algebra, integer programming, and combinatorics that has
developed in recent years.



Additional Exercises for §2 

Exercise 6. Let $\alpha$ and $\alpha'$ be as in the proof of Proposition (2.14). Show
that

    $x^\alpha = q(x_1, \ldots, x_6)(x_1 x_2 x_3 - x_4 x_5 x_6) + x^{\alpha'}$,

where

    $q = \left( (x_1 x_2 x_3)^{A-1} + (x_1 x_2 x_3)^{A-2}(x_4 x_5 x_6) + \cdots + (x_4 x_5 x_6)^{A-1} \right) \cdot x_1^{a-A} x_2^{b-A} x_3^{c-A} x_4^d x_5^e x_6^f$.

Deduce that $x^{\alpha'}$ is the standard form of $x^\alpha$ in $R$.



Exercise 7. Use Exercise 5 to show that if we have any two expressions
as in (2.15) for a given $M$ with coefficient vectors $\alpha = (a, \ldots, f)$ and
$\alpha_1 = (a_1, \ldots, f_1)$, then the corresponding monomials $x^\alpha$ and $x^{\alpha_1}$ have the
same standard form $x^{\alpha'}$ in $R = k[x_1, \ldots, x_6]/\langle x_1 x_2 x_3 - x_4 x_5 x_6 \rangle$.

Exercise 8. There is another formula, due to MacMahon, for the number
of nonnegative integer magic squares of size 3 with a given sum $s$:

    $MS_3(s) = \binom{s+4}{4} + \binom{s+3}{4} + \binom{s+2}{4}$.

Show that this formula and (2.16) are equivalent. Hint: This can be proved
in several different ways by applying different binomial coefficient identities.

Exercise 9. Verifying that the Hilbert basis for $K_4 = \ker(A_4) \cap \mathbb{Z}^{16}_{\geq 0}$
consists of exactly 24 elements corresponding to the $4 \times 4$ permutation matrices
is already a large calculation if you apply the Gröbner basis method
of Theorem (2.9). For larger $n$, this approach quickly becomes infeasible
because of the large number of variables needed to make the polynomial
translation. Fortunately, there is also a non-computational proof that every
$n \times n$ matrix $M$ with nonnegative integer entries and row and column sums
all equal to $s$ is a linear combination of $n \times n$ permutation matrices with
nonnegative integer coefficients. The proof is by induction on the number
of nonzero entries in the matrix.

a. The base case of the induction is the case where exactly n of the entries 
are nonzero (why?). Show in this case that M is equal to sP for some 
permutation matrix P. 

b. Now assume that the theorem has been proved for all $M$ with $k$ or fewer
nonzero entries and consider an $M$ with equal row and column sums and
$k + 1$ nonzero entries. Using the transversal form of Hall’s “marriage”
theorem (see, for instance, [Bry]), show that there is some collection of
$n$ nonzero entries in $M$, one from each row and one from each column.

c. Continuing from b, let d > 0 be the smallest element in the collection 
of nonzero entries found in that part, let P be the permutation matrix 
corresponding to the locations of those nonzero entries, and apply the 
induction hypothesis to M — dP. Deduce the desired result on M. 

d. A doubly stochastic matrix is an n x n matrix with nonnegative real 
entries, all of whose row and column sums equal 1. Adapt the proof 
sketched in parts a-c to show that the collection of doubly stochastic 
matrices is the convex hull of the set of n x n permutation matrices. 
(See Chapter 7, §1 for more details about convex hulls.) 
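The induction in parts a through c is effectively an algorithm for writing a magic square as a nonnegative integer combination of permutation matrices. The following minimal sketch makes the steps concrete. It is illustrative only: the function names are our own, and the transversal is found by trying all permutations, which is reasonable for small $n$ but is not how one would apply Hall's theorem in general.

```python
from itertools import permutations


def decompose(M):
    """Write M (square, nonnegative integers, equal row and column sums)
    as a list of terms (d, p), meaning d times the permutation matrix
    with a 1 in row i, column p[i]."""
    n = len(M)
    M = [list(row) for row in M]          # work on a copy
    terms = []
    while any(any(row) for row in M):
        # Part b: some permutation is supported on nonzero entries.
        p = next(
            q for q in permutations(range(n))
            if all(M[i][q[i]] > 0 for i in range(n))
        )
        # Part c: subtract d * P, where d is the smallest selected entry.
        d = min(M[i][p[i]] for i in range(n))
        for i in range(n):
            M[i][p[i]] -= d
        terms.append((d, p))
    return terms


def rebuild(terms, n):
    """Sum the terms d * P back into a matrix."""
    R = [[0] * n for _ in range(n)]
    for d, p in terms:
        for i in range(n):
            R[i][p[i]] += d
    return R


M = [[2, 1, 0],
     [0, 1, 2],
     [1, 1, 1]]
terms = decompose(M)
print(terms)
print(rebuild(terms, 3) == M)
```

Each pass through the loop creates at least one new zero entry and keeps the row and column sums equal, so the loop terminates, and the coefficients $d$ add up to the common sum $s$.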

Exercise 10. 

a. How many $3 \times 3$ nonnegative integer magic squares with sum $s$ are there
if we add the condition that the two diagonal sums should also equal $s$?

b. What about the corresponding question for 4 x 4 matrices? 

Exercise 11. Study the collections of symmetric $3 \times 3$ and $4 \times 4$ nonnegative
integer magic squares. What are the Hilbert bases for the monoids of
solutions of the corresponding equations? What relations are there? Find
the number of squares with a given row and column sum $s$ in each case.



§3 Multivariate Polynomial Splines 

In this section we will discuss a recent application of the theory of Gröbner
bases to the problem of constructing and analyzing the piecewise polynomial
or spline functions with a specified degree of smoothness on polyhedral subdivisions
of regions in $\mathbb{R}^n$. Two-variable functions of this sort are frequently
used in computer-aided design to specify the shapes of curved surfaces, and
the degree of smoothness attainable in some specified class of piecewise
polynomial functions is an important design consideration. For an introductory
treatment, see [Far]. Uni- and multivariate splines are also used
to interpolate values or approximate other functions in numerical analysis,
most notably in the finite element method for deriving approximate solutions
to partial differential equations. The application of Gröbner bases to
this subject appeared first in papers of L. Billera and L. Rose ([BR1], [BR2],
[BR3], [Ros]). For more recent results, we refer the reader to [SS]. We will
need to use the results on Gröbner bases for modules over polynomial rings
from Chapter 5.

To introduce some of the key ideas, we will begin by considering the
simplest case of one-variable spline functions. On the real line, consider the
subdivision of an interval $[a, b]$ into two subintervals $[a, c] \cup [c, b]$ given by
any $c$ satisfying $a < c < b$. In rough terms, a piecewise polynomial function
on this subdivided interval is any function of the form

(3.1)    $f(x) = \begin{cases} f_1(x) & \text{if } x \in [a, c] \\ f_2(x) & \text{if } x \in [c, b], \end{cases}$

where $f_1$ and $f_2$ are polynomials in $\mathbb{R}[x]$. Note that we can always make
“trivial” spline functions using the same polynomial $f_1 = f_2$ on both subintervals,
but those are less interesting because we do not have independent
control over the shape of the graph of each piece. Hence we will usually be
more interested in finding splines with $f_1 \neq f_2$. Of course, as stated, (3.1)
gives us a well-defined function on $[a, b]$ if and only if $f_1(c) = f_2(c)$, and
if this is true, then $f$ is continuous as a function on $[a, b]$. For instance,
taking $a = 0$, $c = 1$, $b = 2$, and

    $f(x) = \begin{cases} x + 1 & \text{if } x \in [0, 1] \\ x^2 - x + 2 & \text{if } x \in [1, 2], \end{cases}$

we get a continuous polynomial spline function. See Fig. 8.3.

Figure 8.3. A continuous spline function

Since the polynomial functions $f_1, f_2$ are $C^\infty$ functions (that is, they
have derivatives of all orders) and their derivatives are also polynomials,
we can consider the piecewise polynomial derivative functions

    $\begin{cases} f_1^{(r)}(x) & \text{if } x \in [a, c] \\ f_2^{(r)}(x) & \text{if } x \in [c, b] \end{cases}$

for any $r \geq 0$. As above, we see that $f$ is a $C^r$ function on $[a, b]$ (that is,
$f$ is $r$-times differentiable and its $r$th derivative, $f^{(r)}$, is continuous) if and
only if $f_1^{(s)}(c) = f_2^{(s)}(c)$ for each $s$, $0 \leq s \leq r$. The following result gives a
more algebraic version of this criterion.

(3.2) Proposition. The piecewise polynomial function $f$ in (3.1) defines
a $C^r$ function on $[a, b]$ if and only if the polynomial $f_1 - f_2$ is divisible by
$(x - c)^{r+1}$ (that is, $f_1 - f_2 \in \langle (x - c)^{r+1} \rangle$ in $\mathbb{R}[x]$).

For example, the spline function pictured in Fig. 8.3 is actually a $C^1$
function since $(x^2 - x + 2) - (x + 1) = (x - 1)^2$. We leave the proof of this
proposition to the reader.
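The divisibility criterion of Proposition (3.2) is easy to test by repeated synthetic division. The sketch below computes how many times $(x - c)$ divides $f_1 - f_2$; if that multiplicity is $\mu \geq 1$, the spline is $C^{\mu - 1}$ and no smoother. The representation of polynomials as coefficient lists (lowest degree first) and the function names are our own assumptions, and the case $f_1 = f_2$ is excluded.

```python
from fractions import Fraction


def poly_sub(p, q):
    """Difference p - q of two coefficient lists (lowest degree first)."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]


def multiplicity_at(p, c):
    """Number of times (x - c) divides the nonzero polynomial p."""
    coeffs = [Fraction(a) for a in reversed(p)]   # highest degree first
    count = 0
    while any(coeffs):
        # Synthetic division by (x - c); the last value is the remainder p(c).
        partial, value = [], Fraction(0)
        for a in coeffs:
            value = value * c + a
            partial.append(value)
        remainder = partial.pop()
        if remainder != 0:
            break
        coeffs = partial
        count += 1
    return count


f1 = [1, 1]        # x + 1 on [0, 1]
f2 = [2, -1, 1]    # x^2 - x + 2 on [1, 2]
mu = multiplicity_at(poly_sub(f1, f2), 1)
print(mu)          # the spline is C^(mu - 1)
```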

Exercise 1. Prove Proposition (3.2). 

In practice, it is most common to consider classes of spline functions 
where the fi are restricted to be polynomial functions of degree bounded 
by some fixed integer k. With k = 2 we get quadratic splines, with k = 3 
we get cubic splines, and so forth. 

We will work with two-component splines on a subdivided interval
$[a, b] = [a, c] \cup [c, b]$ here. More general subdivisions are considered in Exercise
2 below. We can represent a spline function as in (3.1) by the ordered
pair $(f_1, f_2) \in \mathbb{R}[x]^2$. From Proposition (3.2) it follows that the $C^r$ splines
form a vector subspace of $\mathbb{R}[x]^2$ under the usual componentwise addition
and scalar multiplication. (Also see Proposition (3.10) below, which gives
a stronger statement and which includes this one-variable situation as a
special case.) Restricting the degree of each component as above, we get
elements of the finite-dimensional vector subspace $V_k$ of $\mathbb{R}[x]^2$ spanned by

    $(1, 0), (x, 0), \ldots, (x^k, 0), (0, 1), (0, x), \ldots, (0, x^k)$.

The $C^r$ splines in $V_k$ form a vector subspace $V_k^r \subset V_k$. We will focus on
the following two questions concerning the $V_k^r$.

(3.3) Questions.

a. What is the dimension of $V_k^r$?

b. Given $k$, what is the biggest $r$ for which there exist $C^r$ spline functions
$f$ in $V_k^r$ for which $f_1 \neq f_2$?

We can answer both of these questions easily in this simple setting. First
note that any piecewise polynomial in $V_k$ can be uniquely decomposed as
the sum of a spline of the form $(f, f)$, and a spline of the form $(0, g)$:

    $(f_1, f_2) = (f_1, f_1) + (0, f_2 - f_1)$.

Moreover, both terms on the right are again in $V_k$. Any spline function of
the form $(f, f)$ is automatically $C^r$ for every $r \geq 0$. On the other hand,
by Proposition (3.2), a spline of the form $(0, g)$ defines a $C^r$ function if
and only if $(x - c)^{r+1}$ divides $g$, and for nonzero $g$ this is possible only if
$r + 1 \leq k$. If $r + 1 \leq k$, any linear combination of
$(0, (x - c)^{r+1}), \ldots, (0, (x - c)^k)$ gives an element of $V_k^r$, and these
$k - r$ piecewise polynomial functions, together
with the $(1, 1), (x, x), \ldots, (x^k, x^k)$ give a basis for $V_k^r$. These observations
yield the following answers to (3.3).



(3.4) Proposition. For one-variable spline functions on a subdivided
interval $[a, b] = [a, c] \cup [c, b]$, the dimension of the space $V_k^r$ is

    $\dim(V_k^r) = \begin{cases} k + 1 & \text{if } r + 1 > k \\ 2k - r + 1 & \text{if } r + 1 \leq k. \end{cases}$

The space $V_k^r$ contains spline functions not of the form $(f, f)$ if and only
if $r + 1 \leq k$.



For instance, there are $C^1$ quadratic splines for which $f_1 \neq f_2$, but no
$C^2$ quadratic splines except the ones of the form $(f, f)$. Similarly there are
$C^2$ cubic splines for which $f_1 \neq f_2$, but no $C^3$ cubic splines of this form.
The vector space of $C^2$ cubic spline functions is 5-dimensional by (3.4).
This means, for example, that there is a 2-dimensional space of $C^2$ cubic
splines with any given values $f(a) = A$, $f(c) = C$, $f(b) = B$ at $x = a, c, b$.
Because this freedom gives additional control over the shape of the graph
of the spline function, one-variable cubic splines are used extensively as
interpolating functions in numerical analysis.

The reader should have no difficulty extending all of the above to spline
functions on any subdivided interval $[a, b]$, where the subdivision is specified
by an arbitrary partition.
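The dimension count in (3.4) can be confirmed by linear algebra. A pair $(f_1, f_2)$ of polynomials of degree at most $k$ has $2(k + 1)$ coefficients, and the $C^r$ condition imposes the $r + 1$ linear equations $f_1^{(s)}(c) = f_2^{(s)}(c)$, $0 \leq s \leq r$. The sketch below, with our own function names and the arbitrary choice $c = 1$, builds these equations, computes their rank exactly with rational arithmetic, and compares the resulting dimension with the formula.

```python
from fractions import Fraction
from math import perm


def derivative_values(k, s, c):
    """Values at x = c of the s-th derivatives of 1, x, ..., x^k."""
    return [
        Fraction(perm(j, s)) * Fraction(c) ** (j - s) if j >= s else Fraction(0)
        for j in range(k + 1)
    ]


def rank(rows):
    """Rank of a matrix of Fractions, by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def spline_dimension(k, r, c=1):
    """Dimension of the space of C^r splines (f1, f2) with deg f1, f2 <= k."""
    conditions = []
    for s in range(r + 1):
        d = derivative_values(k, s, c)
        conditions.append(d + [-x for x in d])   # f1^(s)(c) - f2^(s)(c) = 0
    return 2 * (k + 1) - rank(conditions)


def formula(k, r):
    """Dimension predicted by Proposition (3.4)."""
    return k + 1 if r + 1 > k else 2 * k - r + 1


print(spline_dimension(3, 2), formula(3, 2))
```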



Exercise 2. Consider a partition

    $a = x_0 < x_1 < x_2 < \cdots < x_{m-1} < x_m = b$

of the interval $[a, b]$ into $m$ smaller intervals.

a. Let $(f_1, \ldots, f_m) \in \mathbb{R}[x]^m$ be an $m$-tuple of polynomials. Define $f$ on
$[a, b]$ by setting $f|_{[x_{i-1}, x_i]} = f_i$. Show that $f$ is a $C^r$ function on $[a, b]$ if
and only if for each $i$, $1 \leq i \leq m - 1$, $f_{i+1} - f_i \in \langle (x - x_i)^{r+1} \rangle$.

b. What is the dimension of the space of $C^r$ splines with $\deg f_i \leq k$ for all
$i$? Find a basis. Hint: There exists a nice “triangular” basis generalizing
what we did in the text for the case of two subintervals.

c. Show that there is a 2-dimensional space of $C^2$ cubic spline functions
interpolating any specified values at the $x_i$, $i = 0, \ldots, m$.

We now turn to multivariate splines. Corresponding to subdivisions of
intervals in $\mathbb{R}$, we will consider certain subdivisions of polyhedral regions
in $\mathbb{R}^n$. As in Chapter 7, a polytope is the convex hull of a finite set in $\mathbb{R}^n$,
and by (1.4) of that chapter, a polytope can be written as the intersection
of a collection of affine half-spaces. In constructing partitions of intervals
in $\mathbb{R}$, we allowed the subintervals to intersect only at common endpoints.
Similarly, in $\mathbb{R}^n$ we will consider subdivisions of polyhedral regions into
polytopes that intersect only along common faces.

The major new feature in $\mathbb{R}^n$, $n \geq 2$, is the much greater geometric freedom
possible in constructing such subdivisions. We will use the following
language to describe them.

(3.5) Definition.

a. A polyhedral complex $\Delta \subset \mathbb{R}^n$ is a finite collection of polytopes such that
the faces of each element of $\Delta$ are elements of $\Delta$, and the intersection
of any two elements of $\Delta$ is an element of $\Delta$. We will sometimes refer
to the $k$-dimensional elements of a complex $\Delta$ as $k$-cells.

b. A polyhedral complex $\Delta \subset \mathbb{R}^n$ is said to be pure $n$-dimensional if every
maximal element of $\Delta$ (with respect to inclusion) is an $n$-dimensional
polyhedron.

c. Two $n$-dimensional polytopes in a complex $\Delta$ are said to be adjacent if
they intersect along a common face of dimension $n - 1$.

d. $\Delta$ is said to be a hereditary complex if for every $\tau \in \Delta$ (including the
empty set), any two $n$-dimensional polytopes $\sigma, \sigma'$ of $\Delta$ that contain $\tau$
can be connected by a sequence $\sigma = \sigma_1, \sigma_2, \ldots, \sigma_m = \sigma'$ in $\Delta$ such
that each $\sigma_i$ is $n$-dimensional, each $\sigma_i$ contains $\tau$, and $\sigma_i$ and $\sigma_{i+1}$ are
adjacent for each $i$.

The cells of a complex give a particularly well-structured subdivision of
the polyhedral region $R = \bigcup_{\sigma \in \Delta} \sigma \subset \mathbb{R}^n$.

Here are some examples to illustrate the meaning of these conditions. For
example, Fig. 8.4 is a picture of a polyhedral complex
in $\mathbb{R}^2$ consisting of 18 polytopes in all: the three 2-dimensional polygons
$\sigma_1, \sigma_2, \sigma_3$, eight 1-cells (the edges), six 0-cells (the vertices at the endpoints
of edges), and the empty set, $\emptyset$.

The condition on intersections in the definition of a complex rules out
collections of polyhedra such as the ones in Fig. 8.5. In the collection on
the left (which consists of two triangles, their six edges, their six vertices
and the empty set), the intersection of the two 2-cells is not a cell of the
complex. Similarly, in the collection on the right (which consists of two
triangles and a rectangle, together with their edges and vertices, and the
empty set) the 2-cells meet along subsets of their edges, but not along entire
edges.

A complex such as the one in Fig. 8.6 is not pure, since $\tau$ is maximal
and only 1-dimensional.

Figure 8.4. A polyhedral complex in $\mathbb{R}^2$ (the vertex labels that survive from the figure are $(0, 2)$ and $(2, 0)$)



A complex is not hereditary if it is not connected, or if it has maximal
elements meeting only along faces of codimension 2 or greater, with no
other connection via $n$-cells, as is the case for the complex in Fig. 8.7.
(Here, the cells are the two triangles, their edges and vertices, and finally
the empty set.)

Let $\Delta$ be any pure $n$-dimensional polyhedral complex in $\mathbb{R}^n$, let
$\sigma_1, \ldots, \sigma_m$ be a given, fixed, ordering of the $n$-cells in $\Delta$, and let
$R = \bigcup_{i=1}^{m} \sigma_i$. Generalizing our discussion of univariate splines above, we
introduce the following collections of piecewise polynomial functions on
$R$.

(3.6) Definition.

a. For each $r \geq 0$ we will denote by $C^r(\Delta)$ the collection of $C^r$ functions
$f$ on $R$ (that is, functions such that all $r$th order partial derivatives
exist and are continuous on $R$) such that for every $\delta \in \Delta$ including
those of dimension $< n$, the restriction $f|_\delta$ is a polynomial function
$f_\delta \in \mathbb{R}[x_1, \ldots, x_n]$.

b. $C^r_k(\Delta)$ is the subset of $f \in C^r(\Delta)$ such that the restriction of $f$ to each
cell in $\Delta$ is a polynomial function of degree $k$ or less.

Figure 8.5. Collections of polygons that are not complexes

Figure 8.6. A non-pure complex

Figure 8.7. A non-hereditary complex

Our goal is to study the analogues of Questions (3.3) for the $C^r_k(\Delta)$.
Namely, we wish to compute the dimensions of these spaces over $\mathbb{R}$, and to
determine when they contain nontrivial splines.

We will restrict our attention in the remainder of this section to complexes
$\Delta$ that are both pure and hereditary. If $\sigma_i, \sigma_j$ are adjacent
$n$-cells of $\Delta$, then they intersect along an interior $(n - 1)$-cell $\sigma_{ij} \in \Delta$, a
polyhedral subset of an affine hyperplane $V(\ell_{ij})$, where $\ell_{ij} \in \mathbb{R}[x_1, \ldots, x_n]$
is a polynomial of total degree 1. Generalizing Proposition (3.2) above, we
have the following algebraic characterization of the elements of $C^r(\Delta)$ in
the case of a pure, hereditary complex.

(3.7) Proposition. Let $\Delta$ be a pure, hereditary complex with $m$ $n$-cells $\sigma_i$.
Let $f \in C^r(\Delta)$, and for each $i$, $1 \leq i \leq m$, let $f_i = f|_{\sigma_i} \in \mathbb{R}[x_1, \ldots, x_n]$.
Then for each adjacent pair $\sigma_i, \sigma_j$ in $\Delta$, $f_i - f_j \in \langle \ell_{ij}^{r+1} \rangle$. Conversely, any
$m$-tuple of polynomials $(f_1, \ldots, f_m)$ satisfying $f_i - f_j \in \langle \ell_{ij}^{r+1} \rangle$ for each
adjacent pair $\sigma_i, \sigma_j$ of $n$-cells in $\Delta$ defines an element $f \in C^r(\Delta)$ when we
set $f|_{\sigma_i} = f_i$.

The meaning of Proposition (3.7) is that for pure $n$-dimensional complexes
$\Delta \subset \mathbb{R}^n$, piecewise polynomial functions are determined by their
restrictions to the $n$-cells $\sigma_1, \ldots, \sigma_m$ in $\Delta$. In addition, for hereditary
complexes, the $C^r$ property for piecewise polynomial functions $f$ may be
checked by comparing only the restrictions $f_i = f|_{\sigma_i}$ and $f_j = f|_{\sigma_j}$ for
adjacent pairs of $n$-cells.

Proof. If $f$ is an element of $C^r(\Delta)$, then for each adjacent pair $\sigma_i, \sigma_j$
of $n$-cells in $\Delta$, $f_i - f_j$ and all its partial derivatives of order up to and
including $r$ must vanish on $\sigma_i \cap \sigma_j$. In Exercise 3 below you will show that
this implies $f_i - f_j$ is an element of $\langle \ell_{ij}^{r+1} \rangle$.

Conversely, suppose we have $f_1, \ldots, f_m \in \mathbb{R}[x_1, \ldots, x_n]$ such that $f_i - f_j$
is an element of $\langle \ell_{ij}^{r+1} \rangle$ for each adjacent pair of $n$-cells in $\Delta$. In Exercise 3
below, you will show that this implies that $f_i$ and its partial derivatives of
order up to and including $r$ agree with $f_j$ and its corresponding derivatives
at each point of $\sigma_i \cap \sigma_j$. But the $f_1, \ldots, f_m$ define a $C^r$ function on $R$ if
and only if for every $\delta \in \Delta$ and every pair of $n$-cells $\sigma_p, \sigma_q$ containing $\delta$
(not only adjacent ones) $f_p$ and its partial derivatives of order up to and
including $r$ agree with $f_q$ and its corresponding derivatives at each point
of $\delta$. So let $p, q$ be any pair of indices for which $\delta \subset \sigma_p \cap \sigma_q$. Since $\Delta$ is
hereditary, there is a sequence of $n$-cells

    $\sigma_p = \sigma_{i_1}, \sigma_{i_2}, \ldots, \sigma_{i_k} = \sigma_q$,

each containing $\delta$, such that $\sigma_{i_j}$ and $\sigma_{i_{j+1}}$ are adjacent. By assumption,
this implies that for each $j$, $f_{i_j} - f_{i_{j+1}}$ and all its partial derivatives of
orders up to and including $r$ vanish on $\sigma_{i_j} \cap \sigma_{i_{j+1}} \supset \delta$. But

    $f_p - f_q = (f_{i_1} - f_{i_2}) + (f_{i_2} - f_{i_3}) + \cdots + (f_{i_{k-1}} - f_{i_k})$

and each term on the right and its partials up to and including order $r$ vanish
on $\delta$. Hence $f_1, \ldots, f_m$ define an element of $C^r(\Delta)$. □

Exercise 3. Let $\sigma, \sigma'$ be two adjacent $n$-cells in a polyhedral complex $\Delta$,
and let $\sigma \cap \sigma' \subset V(\ell)$ for a linear polynomial $\ell \in \mathbb{R}[x_1, \ldots, x_n]$.

a. Show that if $f, f' \in \mathbb{R}[x_1, \ldots, x_n]$ satisfy $f - f' \in \langle \ell^{r+1} \rangle$, then the
partial derivatives of all orders $\leq r$ of $f$ and $f'$ agree at every point in
$\sigma \cap \sigma'$.

b. Conversely, if the partial derivatives of all orders $\leq r$ of $f$ and $f'$ agree
at every point in $\sigma \cap \sigma'$, show that $f - f' \in \langle \ell^{r+1} \rangle$.



Fixing any one ordering on the $n$-cells $\sigma_i$ in $\Delta$, we will represent elements
$f$ of $C^r(\Delta)$ by ordered $m$-tuples $(f_1, \ldots, f_m) \in \mathbb{R}[x_1, \ldots, x_n]^m$, where
$f_i = f|_{\sigma_i}$.

Consider the polyhedral complex $\Delta$ in $\mathbb{R}^2$ from Fig. 8.4, with the numbering
of the 2-cells given there. It is easy to check that $\Delta$ is hereditary.
The interior edges are given by $\sigma_1 \cap \sigma_2 \subset V(x)$ and $\sigma_2 \cap \sigma_3 \subset V(y)$. By the
preceding proposition, an element $(f_1, f_2, f_3) \in \mathbb{R}[x, y]^3$ gives an element
of $C^r(\Delta)$ if and only if

    $f_1 - f_2 \in \langle x^{r+1} \rangle$, and
    $f_2 - f_3 \in \langle y^{r+1} \rangle$.

To prepare for our next result, note that these inclusions can be rewritten
in the form

    $f_1 - f_2 + x^{r+1} f_4 = 0$
    $f_2 - f_3 + y^{r+1} f_5 = 0$

for some $f_4, f_5 \in \mathbb{R}[x, y]$. These equations can be rewritten again in vector-matrix
form as

    $\begin{pmatrix} 1 & -1 & 0 & x^{r+1} & 0 \\ 0 & 1 & -1 & 0 & y^{r+1} \end{pmatrix} \begin{pmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.

Thus, elements of $C^r(\Delta)$ are projections onto the first three components
of elements of the kernel of the map $\mathbb{R}[x, y]^5 \to \mathbb{R}[x, y]^2$ defined by

(3.8)    $M(\Delta, r) = \begin{pmatrix} 1 & -1 & 0 & x^{r+1} & 0 \\ 0 & 1 & -1 & 0 & y^{r+1} \end{pmatrix}$.

By Proposition (1.10) and Exercise 9 of §3 of Chapter 5, it follows that
$C^r(\Delta)$ has the structure of a module over the ring $\mathbb{R}[x, y]$. This observation
allows us to apply the theory of Gröbner bases to study splines.
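For the complex of Fig. 8.4 the two membership conditions are simple enough to test directly, since a polynomial lies in $\langle x^{r+1} \rangle$ exactly when every one of its monomials is divisible by $x^{r+1}$. The sketch below assumes a representation of polynomials in $\mathbb{R}[x, y]$ as dictionaries from exponent pairs to coefficients; this representation, the function names, and the sample triple are our own illustrative choices.

```python
def poly_sub(p, q):
    """Difference p - q of polynomials given as {(i, j): coeff} for x^i y^j."""
    out = dict(p)
    for mono, coef in q.items():
        out[mono] = out.get(mono, 0) - coef
    return {m: c for m, c in out.items() if c != 0}


def divisible_by_var_power(p, var_index, power):
    """True when every monomial of p contains the chosen variable to at
    least the given power (var_index 0 is x, var_index 1 is y)."""
    return all(mono[var_index] >= power for mono in p)


def is_spline(f1, f2, f3, r):
    """Test f1 - f2 in <x^(r+1)> and f2 - f3 in <y^(r+1)>."""
    return (divisible_by_var_power(poly_sub(f1, f2), 0, r + 1)
            and divisible_by_var_power(poly_sub(f2, f3), 1, r + 1))


# Hypothetical example: f2 = 1 + xy, f1 = f2 + x^2, f3 = f2 + y^3.
f2 = {(0, 0): 1, (1, 1): 1}
f1 = {(0, 0): 1, (1, 1): 1, (2, 0): 1}
f3 = {(0, 0): 1, (1, 1): 1, (0, 3): 1}

print(is_spline(f1, f2, f3, 1))   # C^1 conditions
print(is_spline(f1, f2, f3, 2))   # C^2 conditions
```

In this example $f_1 - f_2 = x^2$ is divisible by $x^2$ but not by $x^3$, so the triple satisfies the conditions for $r = 1$ and fails them for $r = 2$.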

Our next result gives a corresponding statement for $C^r(\Delta)$ in general. We
begin with some necessary notation. Let $\Delta$ be a pure, hereditary polyhedral
complex in $\mathbb{R}^n$. Let $m$ be the number of $n$-cells in $\Delta$, and let $e$ be the
number of interior $(n - 1)$-cells (the intersections $\sigma_i \cap \sigma_j$ for adjacent
$n$-cells). Fix some ordering $\tau_1, \ldots, \tau_e$ for the interior $(n - 1)$-cells and let $\ell_s$ be
a linear polynomial defining the affine hyperplane containing $\tau_s$. Consider
the $e \times (m + e)$ matrix $M(\Delta, r)$ with the following block decomposition:

(3.9)    $M(\Delta, r) = (\partial(\Delta) \mid D)$.

(Note: the orderings of the rows and columns are determined by the orderings
of the indices of the $n$-cells and the interior $(n - 1)$-cells, but any
ordering can be used.) In (3.9), $\partial(\Delta)$ is the $e \times m$ matrix defined by this
rule: In the $s$th row, if $\tau_s = \sigma_i \cap \sigma_j$ with $i < j$, then

    $\partial(\Delta)_{sk} = \begin{cases} +1 & \text{if } k = i \\ -1 & \text{if } k = j \\ 0 & \text{otherwise.} \end{cases}$

In addition, D is the e x e diagonal matrix 



D = 



(V 




V 0 0 









Then as in the example above we have the following statement. 



(3.10) Proposition. Let $\Delta$ be a pure, hereditary polyhedral complex in
$\mathbb{R}^n$, and let $M(\Delta, r)$ be the matrix defined in (3.9) above.

a. An m-tuple $(f_1, \ldots, f_m)$ is in $C^r(\Delta)$ if and only if there exist
$(f_{m+1}, \ldots, f_{m+e})$ such that $f = (f_1, \ldots, f_m, f_{m+1}, \ldots, f_{m+e})^T$ is an
element of the kernel of the map $\mathbb{R}[x_1, \ldots, x_n]^{m+e} \to \mathbb{R}[x_1, \ldots, x_n]^e$
defined by the matrix $M(\Delta, r)$.

b. $C^r(\Delta)$ has the structure of a module over the ring $\mathbb{R}[x_1, \ldots, x_n]$. In the
language of Chapter 5, it is the image of the projection homomorphism
from $\mathbb{R}[x_1, \ldots, x_n]^{m+e}$ onto $\mathbb{R}[x_1, \ldots, x_n]^m$ (in the first m components)
of the module of syzygies on the columns of $M(\Delta, r)$.

c. $C^r_k(\Delta)$ is a finite-dimensional vector subspace of $C^r(\Delta)$.

Proof. Part a is essentially just a restatement of Proposition (3.7). For
each interior edge $\tau_s = \sigma_i \cap \sigma_j$ (i < j) we have an equation

$$f_i - f_j = -\ell_s^{r+1} f_{m+s}$$

for some $f_{m+s} \in \mathbb{R}[x_1, \ldots, x_n]$. This is the equation obtained by setting
the sth component of the product $M(\Delta, r) f$ equal to zero.

Part b follows immediately from part a as in Chapter 5, Proposition
(1.10) and Exercise 9 of Chapter 5, §3.

Part c follows by a direct proof, or more succinctly from part b, since
$C^r_k(\Delta)$ is closed under sums and products by constant polynomials. □



The Gröbner basis algorithm based on Schreyer’s Theorem (Chapter 5,
Theorem (3.3)) may be applied to compute a Gröbner basis for the kernel
of $M(\Delta, r)$ for each r, and from that information the dimensions of, and
bases for, the $C^r_k(\Delta)$ may be determined.

As a first example, let us compute the $C^r_k(\Delta)$ for the complex $\Delta \subset \mathbb{R}^2$
from (3.8). We consider the matrix as in (3.8) with r = 1 first. Using any
monomial order in $\mathbb{R}[x, y]^5$ with $e_5 > \cdots > e_1$, we compute a Gröbner
basis for $\ker(M(\Delta, 1))$ (that is, the module of syzygies of the columns of
$M(\Delta, 1)$) and we find three basis elements, the transposes of

$$g_1 = (1, 1, 1, 0, 0)$$
$$g_2 = (-x^2, 0, 0, 1, 0)$$
$$g_3 = (-y^2, -y^2, 0, 0, 1).$$

(In this simple case, it is easy to write down these syzygies by inspection.
They must generate the module of syzygies because of the form of the
matrix $M(\Delta, r)$: the last three components of the vector f are arbitrary,
and these determine the first two.) The elements of $C^1(\Delta)$ are given by
projection on the first three components, so we see that the general element
of $C^1(\Delta)$ will have the form

$$f(1, 1, 1) + g(-x^2, 0, 0) + h(-y^2, -y^2, 0) = (f - g x^2 - h y^2,\; f - h y^2,\; f),$$
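The syzygy property of the three generators can be spot-checked numerically. The following is a minimal sketch, not part of the original text: it evaluates the matrix from (3.8) and the vectors $g_1, g_2, g_3$ (in the general-r form, with $x^{r+1}$ and $y^{r+1}$ in place of $x^2$ and $y^2$) at random integer points and confirms that every product vanishes. The function names are illustrative, and a check at finitely many points is evidence rather than a proof.

```python
import random

def spline_matrix(x, y, r=1):
    """The 2 x 5 matrix from (3.8), evaluated at the point (x, y)."""
    return [[1, -1, 0, x ** (r + 1), 0],
            [0, 1, -1, 0, y ** (r + 1)]]

def candidate_syzygies(x, y, r=1):
    """The vectors g1, g2, g3, evaluated at (x, y)."""
    return [(1, 1, 1, 0, 0),
            (-x ** (r + 1), 0, 0, 1, 0),
            (-y ** (r + 1), -y ** (r + 1), 0, 0, 1)]

def is_syzygy_at(x, y, r=1):
    """True if every row of the matrix kills every candidate at (x, y)."""
    rows = spline_matrix(x, y, r)
    return all(sum(a * b for a, b in zip(row, g)) == 0
               for g in candidate_syzygies(x, y, r)
               for row in rows)

random.seed(0)
points = [(random.randint(-50, 50), random.randint(-50, 50)) for _ in range(20)]
all_ok = all(is_syzygy_at(x, y, r) for (x, y) in points for r in (1, 2, 3))
print(all_ok)
```

Exact integer arithmetic is used, so there is no rounding to worry about; a symbolic check would replace the sample points with polynomial multiplication.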

where $f, g, h \in \mathbb{R}[x, y]$ are arbitrary polynomials. Note that the triples
with g = h = 0 are the “trivial” splines where we take the same polynomial
on each $\sigma_i$, while the other generators contribute terms supported on only
one or two of the 2-cells. The algebraic structure of $C^1(\Delta)$ as a module over
$\mathbb{R}[x, y]$ is very simple: $C^1(\Delta)$ is a free module and the given generators
form a module basis. (Billera and Rose show in Theorem 4.4 of [BR3] that
the same is true for $C^r(\Delta)$ for any hereditary complex $\Delta \subset \mathbb{R}^2$ and all $r \ge 1$.)
Using this decomposition it is also easy to count the dimension of $C^1_k(\Delta)$
for each k. For k = 0, 1, we have only the “trivial” splines, so $\dim C^1_0(\Delta) = 1$,
and $\dim C^1_1(\Delta) = 3$ (a vector space basis is {(1, 1, 1), (x, x, x), (y, y, y)}).
For $k \ge 2$, there are nontrivial splines as well, and we see by counting
monomials of the appropriate degrees in f, g, h (degree at most k for f, and
at most k − 2 for g and h) that

$$\dim C^1_k(\Delta) = \binom{k+2}{2} + 2\binom{k}{2}.$$

Also see Exercise 9 below for a more succinct way to package the
information from the function $\dim C^1_k(\Delta)$.
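The monomial count behind this dimension formula can be sketched in a few lines. The helper below is an illustration, not the authors' code: it assumes a free module basis whose generators have known degrees $d_i$ and for which no cancellation of top-degree terms occurs, so that a combination $\sum c_i g_i$ lies in degree at most k exactly when each coefficient $c_i$ has degree at most $k - d_i$. The number of monomials of degree at most j in two variables is $\binom{j+2}{2}$.

```python
from math import comb

def dim_from_generator_degrees(k, degrees, nvars=2):
    """Count coefficient choices c_i with deg(c_i) <= k - d_i, summed over
    the generators; assumes a free basis with no top-degree cancellation."""
    total = 0
    for d in degrees:
        if k >= d:
            # monomials of degree <= k - d in nvars variables
            total += comb(k - d + nvars, nvars)
    return total

# Generators of degrees 0, 2, 2, as in the example with r = 1.
dims = [dim_from_generator_degrees(k, (0, 2, 2)) for k in range(6)]
print(dims)
```

For $k \ge 2$ the values agree with $\binom{k+2}{2} + 2\binom{k}{2}$, and for k = 0, 1 only the generator of degree 0 contributes.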

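For larger r, the situation is entirely analogous in this example. A
Gröbner basis for the kernel of $M(\Delta, r)$ is given by

$$g_1 = (1, 1, 1, 0, 0)^T$$
$$g_2 = (-x^{r+1}, 0, 0, 1, 0)^T$$
$$g_3 = (-y^{r+1}, -y^{r+1}, 0, 0, 1)^T,$$

and we have that $C^r(\Delta)$ is a free module over $\mathbb{R}[x, y]$ for all $r \ge 0$. Thus

$$\dim C^r_k(\Delta) = \begin{cases} \binom{k+2}{2} & \text{if } k < r + 1 \\ \binom{k+2}{2} + 2\binom{k-r+1}{2} & \text{if } k \ge r + 1. \end{cases}$$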

Our next examples, presented as exercises for the reader, indicate some of 
the subtleties that can occur for more complicated complexes. (Additional 
examples can be found in the exercises at the end of the section.) 

Exercise 4. In $\mathbb{R}^2$, consider the convex quadrilateral

R = Conv({(2, 0), (0, 1), (−1, 1), (−1, −2)})

(notation as in §1 of Chapter 7), and subdivide R into triangles by connecting
each vertex to the origin by line segments. We obtain in this way
a pure, hereditary polyhedral complex $\Delta$ containing four 2-cells, eight
1-cells (four interior ones), five 0-cells, and $\emptyset$. Number the 2-cells $\sigma_1, \ldots, \sigma_4$
proceeding counter-clockwise around the origin starting from the triangle
$\sigma_1$ = Conv({(2, 0), (0, 0), (0, 1)}). The interior 1-cells of $\Delta$ are then

$$\sigma_1 \cap \sigma_2 \subset \mathbf{V}(x)$$
$$\sigma_2 \cap \sigma_3 \subset \mathbf{V}(x + y)$$
$$\sigma_3 \cap \sigma_4 \subset \mathbf{V}(2x - y)$$
$$\sigma_1 \cap \sigma_4 \subset \mathbf{V}(y).$$

a. Using this ordering on the interior 1-cells, show that we obtain

$$\begin{pmatrix} 1 & -1 & 0 & 0 & x^{r+1} & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 & (x+y)^{r+1} & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 & 0 & (2x-y)^{r+1} & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 & 0 & y^{r+1} \end{pmatrix}$$

for the matrix $M(\Delta, r)$.

b. With r = 1, for instance, show that a Gröbner basis for the $\mathbb{R}[x, y]$-module
of syzygies on the columns of $M(\Delta, 1)$ is given by the transposes
of the following vectors

$$g_1 = (1, 1, 1, 1, 0, 0, 0, 0)$$
$$g_2 = (1/4)(3y^2,\; 6x^2 + 3y^2,\; 4x^2 - 4xy + y^2,\; 0,\; 6,\; -2,\; -1,\; -3)$$
$$g_3 = (2xy^2 + y^3,\; x^2 y + 2xy^2 + y^3,\; 0,\; 0,\; y,\; -y,\; 0,\; -2x - y)$$
$$g_4 = (-3xy^2 - 2y^3,\; x^3 - 3xy^2 - 2y^3,\; 0,\; 0,\; x,\; -x + 2y,\; 0,\; 3x + 2y).$$

c. As before, the elements of $C^1(\Delta)$ are obtained by projection onto the
first four components. From this, show that there are only “trivial”
splines in $C^1_0(\Delta)$ and $C^1_1(\Delta)$, but $g_2$ and its multiples give nontrivial
splines in all degrees $k \ge 2$, while $g_3$ and $g_4$ also contribute terms in
degrees $k \ge 3$.

d. Show that the $g_i$ form a basis for $C^1(\Delta)$, so it is a free module. Thus

$$\dim C^1_k(\Delta) = \begin{cases} 1 & \text{if } k = 0 \\ 3 & \text{if } k = 1 \\ 7 & \text{if } k = 2 \\ \binom{k+2}{2} + \binom{k}{2} + 2\binom{k-1}{2} & \text{if } k \ge 3. \end{cases}$$



We will next consider a second polyhedral complex $\Delta'$ in $\mathbb{R}^2$ which has
the same combinatorial data as $\Delta$ in Exercise 4 (that is, the numbers of
k-cells are the same for all k, the containment relations are the same, and
so forth), but which is in special position.

Exercise 5. In $\mathbb{R}^2$, consider the convex quadrilateral

R = Conv({(2, 0), (0, 1), (−1, 0), (0, −2)}).

Subdivide R into triangles by connecting each vertex to the origin by line
segments. This gives a pure, hereditary polyhedral complex $\Delta'$ with four
2-cells, eight 1-cells (four interior ones), five 0-cells, and $\emptyset$. Number the
2-cells $\sigma_1, \ldots, \sigma_4$ proceeding counter-clockwise around the origin starting
from the triangle $\sigma_1$ with vertices (2, 0), (0, 0), (0, 1). The interior 1-cells
of $\Delta'$ are then

$$\sigma_1 \cap \sigma_2 \subset \mathbf{V}(x)$$
$$\sigma_2 \cap \sigma_3 \subset \mathbf{V}(y)$$
$$\sigma_3 \cap \sigma_4 \subset \mathbf{V}(x)$$
$$\sigma_1 \cap \sigma_4 \subset \mathbf{V}(y).$$

This is what we meant before by saying that $\Delta'$ is in special position: the
interior edges lie on only two distinct lines, rather than four of them.

a. Using this ordering on the interior 1-cells, show that we obtain

$$M(\Delta', r) = \begin{pmatrix} 1 & -1 & 0 & 0 & x^{r+1} & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 & y^{r+1} & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 & 0 & x^{r+1} & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 & 0 & y^{r+1} \end{pmatrix}.$$

b. With r = 1, for instance, show that a Gröbner basis for the $\mathbb{R}[x, y]$-module
of syzygies on the columns of $M(\Delta', 1)$ is given by the transposes
of

$$g'_1 = (1, 1, 1, 1, 0, 0, 0, 0)$$
$$g'_2 = (0, x^2, x^2, 0, 1, 0, -1, 0)$$
$$g'_3 = (y^2, y^2, 0, 0, 0, -1, 0, -1)$$
$$g'_4 = (x^2 y^2, 0, 0, 0, -y^2, 0, 0, -x^2).$$

Note that these generators have a different form (in particular, the components
have different total degrees) than the generators for the syzygies
on the columns of $M(\Delta, 1)$.

c. Check that the $g'_i$ form a basis of $C^1(\Delta')$, and that

$$\dim C^1_k(\Delta') = \begin{cases} 1 & \text{if } k = 0 \\ 3 & \text{if } k = 1 \\ 8 & \text{if } k = 2 \\ 16 & \text{if } k = 3 \\ \binom{k+2}{2} + 2\binom{k}{2} + \binom{k-2}{2} & \text{if } k > 3. \end{cases}$$



Comparing Exercises 4 and 5, we see that the dimensions of $C^r_k(\Delta)$ can
depend on more than just the combinatorial data of the polyhedral complex
$\Delta$: they can vary depending on the positions of the interior (n − 1)-cells.

The recent paper [Ros] of Lauren Rose sheds some light on examples like 
these. To describe her results, it will be convenient to use the following 
notion. 



(3.12) Definition. The dual graph $G_\Delta$ of a pure n-dimensional complex
$\Delta$ is the graph with vertices corresponding to the n-cells in $\Delta$, and edges
corresponding to adjacent pairs of n-cells.

For instance, the dual graphs for the complexes in Exercises 4 and 5 are
both equal to the graph in Fig. 8.8. By an easy translation of the definition
in (3.5), the dual graph of a hereditary complex is connected.

As before, we will denote by e the number of interior (n − 1)-cells and let
$\delta_1, \ldots, \delta_e$ denote some ordering of them. Choose an ordering on the vertices
of $G_\Delta$ (or equivalently on the n-cells of $\Delta$), and consider the induced
orientations of the edges. If $\delta = jk$ is the oriented edge from vertex j to
vertex k in $G_\Delta$, corresponding to the interior (n − 1)-cell $\delta = \sigma_j \cap \sigma_k$, let
$\ell_\delta$ be the equation of the affine hyperplane containing $\delta$. By convention, we
take the negative, $-\ell_\delta$, as the defining equation for the affine hyperplane
containing the edge kj with reversed orientation. For simplicity, we will
also write $\ell_i$ for the linear polynomial $\ell_{\delta_i}$. Finally, let $\mathcal{C}$ denote the set of
cycles in $G_\Delta$. Then, following Rose, we consider a module $B^r(\Delta)$ built out
of syzygies on the $\ell_i^{r+1}$.

[Figure 8.8. The dual graph, with vertices $\sigma_1$, $\sigma_2$, $\sigma_3$, $\sigma_4$]

(3.13) Definition. $B^r(\Delta) \subset \mathbb{R}[x_1, \ldots, x_n]^e$ is the submodule defined by

$$B^r(\Delta) = \Big\{ (g_1, \ldots, g_e) \in \mathbb{R}[x_1, \ldots, x_n]^e : \text{for all } c \in \mathcal{C},\ \sum_{\delta \in c} g_\delta \ell_\delta^{r+1} = 0 \Big\}.$$



In Theorem 2.2 of [Ros], Rose establishes the following connection
between $B^r(\Delta)$ and the module $C^r(\Delta)$ of spline functions.

(3.14) Theorem. If $\Delta$ is hereditary, then $C^r(\Delta)$ is isomorphic to
$B^r(\Delta) \oplus \mathbb{R}[x_1, \ldots, x_n]$ as an $\mathbb{R}[x_1, \ldots, x_n]$-module.

Proof. Consider the mapping

$$\varphi : C^r(\Delta) \to B^r(\Delta) \oplus \mathbb{R}[x_1, \ldots, x_n]$$

defined in the following way. By (3.7), for each $f = (f_1, \ldots, f_m)$ in $C^r(\Delta)$
and each interior (n − 1)-cell $\delta_i = \sigma_j \cap \sigma_k$, we have $f_j - f_k = g_i \ell_i^{r+1}$ for
some $g_i \in \mathbb{R}[x_1, \ldots, x_n]$. Let

$$\varphi(f) = ((g_1, \ldots, g_e), f_1)$$

(the $f_1$ is the component in the $\mathbb{R}[x_1, \ldots, x_n]$ summand). For each cycle c
in the dual graph, $\sum_{\delta \in c} g_\delta \ell_\delta^{r+1}$ equals a sum of the form $\sum (f_j - f_k)$, which
cancels completely to 0 since c is a cycle. Hence, the e-tuple $(g_1, \ldots, g_e)$
is an element of $B^r(\Delta)$. It is easy to see that $\varphi$ is a homomorphism of
$\mathbb{R}[x_1, \ldots, x_n]$-modules.

To show that $\varphi$ is an isomorphism, consider any

$$((g_1, \ldots, g_e), f) \in B^r(\Delta) \oplus \mathbb{R}[x_1, \ldots, x_n].$$

Let $f_1 = f$. For each i, $2 \le i \le m$, since $G_\Delta$ is connected, there is
some path from vertex $\sigma_1$ to $\sigma_i$ in $G_\Delta$, using the edges in some set E.
Let $f_i = f - \sum_{\delta \in E} g_\delta \ell_\delta^{r+1}$, where each edge is oriented in the direction of
the path and, as above, the $g_\delta$ are defined by $f_j - f_k = g_\delta \ell_\delta^{r+1}$ if $\delta$ is the
oriented edge jk. Any two paths between these two
vertices differ by a combination of cycles, so since $(g_1, \ldots, g_e) \in B^r(\Delta)$,
$f_i$ is a well-defined polynomial function on $\sigma_i$, and the m-tuple $(f_1, \ldots, f_m)$
gives a well-defined element of $C^r(\Delta)$ (why?). We obtain in this way a
homomorphism

$$\psi : B^r(\Delta) \oplus \mathbb{R}[x_1, \ldots, x_n] \to C^r(\Delta),$$

and it is easy to check that $\psi$ and $\varphi$ are inverses. □

The algebraic reason for the special form of the generators of the module
$C^1(\Delta')$ in Exercise 5 as compared to those in Exercise 4 can be read off
easily from the alternate description of $C^1(\Delta)$ given by Theorem (3.14).
For the dual graph shown in Fig. 8.8, there is exactly
one cycle. In Exercise 4, numbering the edges counterclockwise, we have

$$\ell_1^2 = x^2, \quad \ell_2^2 = (x + y)^2, \quad \ell_3^2 = (2x - y)^2, \quad \ell_4^2 = y^2.$$

It is easy to check that the dimension over $\mathbb{R}$ of the subspace of $B^1(\Delta)$ with
$g_i$ constant for all i is 1, so that applying the mapping $\psi$ from the proof
of Theorem (3.14), the quotient of the space $C^1_2(\Delta)$ of quadratic splines
modulo the trivial quadratic splines is 1-dimensional. (The spline $g_2$ from
part b of the exercise gives a basis.) On the other hand, in Exercise 5,

$$\ell_1^2 = x^2, \quad \ell_2^2 = y^2, \quad \ell_3^2 = x^2, \quad \ell_4^2 = y^2,$$

so $B^1(\Delta')$ contains both (1, 0, −1, 0) and (0, 1, 0, −1). Under $\psi$, we obtain
that the quotient of $C^1_2(\Delta')$ modulo the trivial quadratic splines is
two-dimensional.
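The two dimension counts just described reduce to linear algebra on the coefficients of the squared linear forms. The sketch below is an illustration under stated assumptions, not the authors' method: each $\ell_i^2$ is encoded by its coefficients on $(x^2, xy, y^2)$, and the space of constant tuples $(g_1, \ldots, g_4)$ satisfying the single cycle relation has dimension 4 minus the rank of the resulting 4 × 3 matrix. Orientation signs only rescale rows by ±1, so they do not affect the rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a rational matrix, by Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def constant_relations(squared_forms):
    """Dimension of {c : sum_i c_i * form_i = 0} with constant c_i."""
    return len(squared_forms) - rank(squared_forms)

# Coefficients on (x^2, xy, y^2).
exercise4 = [(1, 0, 0), (1, 2, 1), (4, -4, 1), (0, 0, 1)]   # x^2, (x+y)^2, (2x-y)^2, y^2
exercise5 = [(1, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1)]    # x^2, y^2, x^2, y^2

print(constant_relations(exercise4), constant_relations(exercise5))
```

Exact rational arithmetic via `Fraction` avoids any floating-point rank ambiguity.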

As an immediate corollary of Theorem (3.14), we note the following
general sufficient condition for $C^r(\Delta)$ to be a free module.

(3.15) Corollary. If $\Delta$ is hereditary and $G_\Delta$ is a tree (i.e., a connected
graph with no cycles), then $C^r(\Delta)$ is free for all $r \ge 0$.

Proof. If there are no cycles, then $B^r(\Delta)$ is equal to the free module
$\mathbb{R}[x_1, \ldots, x_n]^e$, and the corollary follows from Theorem (3.14). This result
is Theorem 3.1 of [Ros]. □

Returning to bivariate splines, for generic pure 2-dimensional hereditary
simplicial complexes $\Delta$ in $\mathbb{R}^2$ (that is, complexes where all 2-cells are triangles
whose edges are in sufficiently general position) giving triangulations of
2-manifolds with boundary in the plane, there is a simple combinatorial formula
for $\dim C^1_k(\Delta)$ first conjectured by Strang, and proved by Billera (see
[Bil1]). The form of this dimension formula given in [BR1] is the following:

(3.16) $$\dim C^1_k(\Delta) = \binom{k+2}{2} + (h_1 - h_2)\binom{k}{2} + 2 h_2 \binom{k-1}{2}.$$

Here $h_1$ and $h_2$ are determined by purely combinatorial data from $\Delta$:

(3.17) $$h_1 = V - 3 \quad \text{and} \quad h_2 = 3 - 2V + E,$$

where V is the number of 0-cells, and E is the number of 1-cells in $\Delta$.
(Also see Exercise 12 below for Strang’s original dimension formula, and
its connection to (3.16).)

For example, the simplicial complex $\Delta$ in Exercise 4, in which the interior
edges lie on four distinct lines (the generic situation) has V = 5 and E = 8,
so $h_1 = 2$ and $h_2 = 1$. Hence (3.16) agrees with the formula from part d
of the exercise. On the other hand, the complex $\Delta'$ from Exercise 5 is not
generic as noted above, and (3.16) is not valid for $\Delta'$.
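The arithmetic in this example is easy to reproduce. The following sketch (function names are illustrative, not from the text) evaluates (3.16) with $h_1, h_2$ from (3.17), and compares it with Strang's form (3.19) from Exercise 12, using the counts for the complex of Exercise 4: V = 5, E = 8, and F = 4 triangles, $E_0 = 4$ interior edges, $V_0 = 1$ interior vertex. Binomial coefficients $\binom{n}{2}$ with n < 2 are taken to be zero, which is an assumption about the intended convention.

```python
from math import comb

def binom(n, k):
    """Binomial coefficient, taken to be 0 when n < k (including n < 0)."""
    return comb(n, k) if n >= k else 0

def dim_generic(k, V, E):
    """Formula (3.16) with h1, h2 as in (3.17)."""
    h1 = V - 3
    h2 = 3 - 2 * V + E
    return binom(k + 2, 2) + (h1 - h2) * binom(k, 2) + 2 * h2 * binom(k - 1, 2)

def dim_strang(k, F, E0, V0):
    """Strang's form (3.19), in terms of triangles and interior edges/vertices."""
    return binom(k + 2, 2) * F - (2 * k + 1) * E0 + 3 * V0

table = [(k, dim_generic(k, 5, 8), dim_strang(k, 4, 4, 1)) for k in range(1, 7)]
print(table)
```

For this complex the two columns agree for every $k \ge 1$; at k = 0 the polynomial expression (3.19) no longer matches, since there the zero convention for $\binom{k-1}{2}$ and its polynomial value differ.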

Interestingly enough, there is no corresponding statement for $n \ge 3$.
Moreover, the modules $C^r(\Delta)$ can fail to be free modules even in very
simple cases (see Exercise 10 c below for instance). As of this writing, the
dependence of the dimensions of $C^r_k(\Delta)$ on the geometry of the complex $\Delta$
is still very little understood for $n \ge 3$, and there are many open questions,
although the recent paper [SS] and further work by Schenck and Stillman
are providing new insights.



Additional Exercises for §3 

Exercise 6. Investigate the modules $C^r(\Delta)$ and $C^r(\Delta')$, $r \ge 2$, for the
complexes from Exercises 4 and 5. What are $\dim C^r_k(\Delta)$ and $\dim C^r_k(\Delta')$?

Exercise 7. Let $\Delta$ be the simplicial complex in $\mathbb{R}^2$ given in Fig. 8.9. The
three interior vertices are at (1/3, 1/6), (1/2, 1/3), and (1/6, 1/2).

a. Find the matrix $M(\Delta, r)$ for each $r \ge 0$.

b. Show that

$$\dim C^1_k(\Delta) = \binom{k+2}{2} + 6\binom{k-1}{2}$$

(where if k < 3, by convention, the second term is taken to be zero).

c. Verify that formula (3.16) is valid for this $\Delta$.



Exercise 8. In the examples we presented in the text, the components of
our Gröbner basis elements were all homogeneous polynomials. This will
not be true in general. In particular, this may fail if some of the interior
(n − 1)-cells of our complex $\Delta$ lie on hyperplanes which do not contain the
origin in $\mathbb{R}^n$. Nevertheless, there is a variant of homogeneous coordinates
used to specify points in projective spaces (see [CLO] Chapter 8) that
we can use if we want to work with homogeneous polynomials exclusively.
Namely, think of a given pure, hereditary complex $\Delta$ as a subset of the
hyperplane $x_{n+1} = 1$, a copy of $\mathbb{R}^n$ in $\mathbb{R}^{n+1}$. By considering the cone $\bar{\sigma}$
over each k-cell $\sigma \in \Delta$ with vertex at (0, . . . , 0, 0) in $\mathbb{R}^{n+1}$, we get a new
polyhedral complex $\bar{\Delta}$ in $\mathbb{R}^{n+1}$.

[Figure 8.9. Figure for Exercise 7]

a. Show that n-cells $\sigma$, $\sigma'$ from $\Delta$ are adjacent if and only if the corresponding
$\bar{\sigma}$, $\bar{\sigma}'$ are adjacent (n + 1)-cells in $\bar{\Delta}$. Show that $\bar{\Delta}$ is hereditary.

b. What are the equations of the interior n-cells in $\bar{\Delta}$?

c. Given $f = (f_1, \ldots, f_m) \in C^r_k(\Delta)$, show that the component-wise
homogenization with respect to $x_{n+1}$, $f^h = (f_1^h, \ldots, f_m^h)$, gives an element
of $C^r_k(\bar{\Delta})$.

d. How are the matrices $M(\Delta, r)$ and $M(\bar{\Delta}, r)$ related?

e. Describe the relation between $\dim C^r_k(\Delta)$ and $\dim C^r_k(\bar{\Delta})$.

Exercise 9. In this exercise we will assume that the construction of
Exercise 8 has been applied, so that $C^r(\Delta)$ is a graded module over
$\mathbb{R}[x_0, \ldots, x_n]$. Then the formal power series

$$H(C^r(\Delta), u) = \sum_{k=0}^{\infty} \dim C^r_k(\Delta)\, u^k$$

is the Hilbert series of the graded module $C^r(\Delta)$. This is the terminology
of Exercise 24 of Chapter 6, §4, and that exercise showed that the Hilbert
series can be written in the form

(3.18) $$H(C^r(\Delta), u) = P(u)/(1 - u)^{n+1},$$

where P(u) is a polynomial in u with coefficients in $\mathbb{Z}$. We obtain the series
from (3.18) by using the formal geometric series expansion

$$1/(1 - u) = \sum_{k=0}^{\infty} u^k.$$

a. Show that the Hilbert series for the module $C^1(\Delta)$ from (3.8) with r = 1
is given by

$$(1 + 2u^2)/(1 - u)^3.$$

b. Show that the Hilbert series for the module $C^1(\Delta)$ from Exercise 4 is

$$(1 + u^2 + 2u^3)/(1 - u)^3.$$

c. Show that the Hilbert series for the module $C^1(\Delta')$ from Exercise 5 is

$$(1 + 2u^2 + u^4)/(1 - u)^3.$$

d. What is the Hilbert series for the module $C^1(\Delta)$ from Exercise 7 above?

Exercise 10. Consider the polyhedral complex $\Delta$ in $\mathbb{R}^3$ formed by subdividing
the octahedron with vertices $\pm e_i$, i = 1, 2, 3 into 8 tetrahedra by
adding an interior vertex at the origin.

a. Find the matrix $M(\Delta, r)$.

b. Find formulas for the dimensions of $C^1_k(\Delta)$ and $C^2_k(\Delta)$.

c. What happens if we move the vertex of the octahedron at $e_3$ to (1, 1, 1)
to form a new, combinatorially equivalent, subdivided octahedron $\Delta'$?
Using Macaulay's hilb command, compute the Hilbert series of the
graded module $\ker M(\Delta', 1)$ and from the result deduce that $C^1(\Delta')$
cannot be a free module. Hint: In the expression (3.18) for the dimension
series of a free module, the coefficients in the numerator P(u) must all
be nonnegative; do you see why?

Exercise 11. This exercise uses the language of exact sequences and some
facts about graded modules from Chapter 6. The method used in the text
to compute dimensions of $C^r_k(\Delta)$ requires the computation of a Gröbner
basis for the module of syzygies on the columns of $M(\Delta, r)$, and it yields
information leading to explicit bases of the spline spaces $C^r_k(\Delta)$. If bases for
these spline spaces are not required, there is another method which can be
used to compute the Hilbert series directly from $M(\Delta, r)$ without computing
the syzygy module. We will assume that the construction of Exercise 8
has been applied, so that the last e columns of the matrix $M(\Delta, r)$ consist
of homogeneous polynomials of degree r + 1. Write $R = \mathbb{R}[x_0, \ldots, x_n]$ and
consider the exact sequence of graded R-modules

$$0 \to \ker M(\Delta, r) \to R^m \oplus R(-r-1)^e \to \operatorname{im} M(\Delta, r) \to 0.$$

a. Show that the Hilbert series of $R^m \oplus R(-r-1)^e$ is given by

$$(m + e u^{r+1})/(1 - u)^{n+1}.$$

b. Show that the Hilbert series of the graded module $\ker M(\Delta, r)$ is the
difference of the Hilbert series from part a and the Hilbert series of the
image of $M(\Delta, r)$.

The Hilbert series of the image can be computed by applying Buchberger’s
algorithm to the module M generated by the columns of $M(\Delta, r)$, then
applying the fact that M and $\langle \mathrm{LT}(M) \rangle$ have the same Hilbert function.

Exercise 12. Strang’s original conjectured formula for the dimension of
$C^1_k(\Delta)$ for a simplicial complex in the plane with F triangles, $E_0$ interior
edges, and $V_0$ interior vertices was

(3.19) $$\dim C^1_k(\Delta) = \binom{k+2}{2} F - (2k + 1) E_0 + 3 V_0,$$

and this is the form proved in [Bil1]. In this exercise, you will show that
this form is equivalent to (3.16), under the assumption that $\Delta$ gives a
triangulation of a topological disk in the plane. Let E and V be the total
numbers of edges and vertices respectively.

a. Show that $V - E + F = 1$ and $V_0 - E_0 + F = 1$ for such a triangulation.
Hint: One approach is to use induction on the number of triangles. In
topological terms, the first equation gives the usual Euler characteristic,
and the second gives the Euler characteristic relative to the boundary.

b. Use part a and the edge-counting relation $3F = E + E_0$ to show that
$E = 3 + 2E_0 - 3V_0$ and $V = 3 + E_0 - 2V_0$.

c. Show that if F is eliminated using part a, and the expressions for V
and E from part b are substituted into (3.16), then (3.19) is obtained.
Conversely, show that (3.19) implies (3.16).

Exercise 13. The methods introduced in this section work for some algebraic,
but non-polyhedral, decompositions of regions in $\mathbb{R}^n$ as well. We will
not essay a general development. Instead we will indicate the idea with a
simple example. In $\mathbb{R}^2$, suppose we wanted to construct $C^r$ piecewise polynomial
functions on the union R of the regions $\sigma_1$, $\sigma_2$, $\sigma_3$ as in Fig. 8.10.
The outer boundary is the circle of radius 1 centered at
the origin, and the three interior edges are portions of the curves $y = x^2$,
$x = -y^2$, and $y = x^3$, respectively.

[Figure 8.10. Figure for Exercise 13]

We can think of this as a non-linear embedding of an abstract
2-dimensional polyhedral complex.

a. Show that a triple $(f_1, f_2, f_3) \in \mathbb{R}[x, y]^3$ defines a $C^r$ spline function on
R if and only if

$$f_1 - f_2 \in \langle (y - x^2)^{r+1} \rangle$$
$$f_2 - f_3 \in \langle (x + y^2)^{r+1} \rangle$$
$$f_1 - f_3 \in \langle (y - x^3)^{r+1} \rangle.$$

b. Express the $C^r$ splines on this subdivided region as the kernel of an
appropriate matrix with polynomial entries, and find the Hilbert function
for the kernel.

Exercise 14. (The Courant functions and the face ring of a complex,
see [Sta1]) Let $\Delta$ be a pure n-dimensional, hereditary complex in $\mathbb{R}^n$. Let
$v_1, \ldots, v_q$ be the vertices of $\Delta$ (the 0-cells).

a. For each i, $1 \le i \le q$, show that there is a unique function $X_i \in C^0_1(\Delta)$
(that is, $X_i$ is continuous, and restricts to a linear function on each
n-cell) such that

$$X_i(v_j) = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j. \end{cases}$$

The $X_i$ are called the Courant functions of $\Delta$.

b. Show that

$$X_1 + \cdots + X_q = 1,$$

the constant function 1 on $\Delta$.

c. Show that if $\{v_{i_1}, \ldots, v_{i_p}\}$ is any collection of vertices which do not
form the vertices of any k-cell in $\Delta$, then

$$X_{i_1} \cdot X_{i_2} \cdots X_{i_p} = 0,$$

the constant function 0 on $\Delta$.

d. For a complex $\Delta$ with vertices $v_1, \ldots, v_q$, following Stanley and Reisner,
we can define the face ring of $\Delta$, denoted $\mathbb{R}[\Delta]$, as the quotient ring

$$\mathbb{R}[\Delta] = \mathbb{R}[x_1, \ldots, x_q]/I_\Delta,$$

where $I_\Delta$ is the ideal generated by the monomials $x_{i_1} x_{i_2} \cdots x_{i_p}$ corresponding
to collections of vertices which are not the vertex set of any
cell in $\Delta$. Show using part c that there is a ring homomorphism from
$\mathbb{R}[\Delta]$ to $\mathbb{R}[X_1, \ldots, X_q]$ (the subalgebra of $C^0(\Delta)$ generated over $\mathbb{R}$ by
the Courant functions) obtained by mapping $x_i$ to $X_i$ for each i.

Billera has shown that in fact $C^0(\Delta)$ equals the algebra generated by
the Courant functions over $\mathbb{R}$, and that the induced mapping

$$\varphi : \mathbb{R}[\Delta]/\langle x_1 + \cdots + x_q - 1 \rangle \to C^0(\Delta)$$

(see part b) is an isomorphism of $\mathbb{R}$-algebras. See [Bil2].




Chapter 9 

Algebraic Coding Theory 



In this chapter we will discuss some applications of techniques from compu- 
tational algebra and algebraic geometry to problems in coding theory. After 
a preliminary section on the arithmetic of finite fields, we will introduce 
some basic terminology for describing error-correcting codes. We will study 
two important classes of examples — linear codes and cyclic codes — where 
the set of codewords possesses additional algebraic structure, and we will 
use this structure to develop good encoding and decoding algorithms. Fi- 
nally, we will introduce the Reed-Muller and geometric Goppa codes, where 
algebraic geometry is used in the construction of the code itself. 

§1 Finite Fields 

To make our presentation as self-contained as possible, in this section we 
will develop some of the basic facts about the arithmetic of finite fields. We 
will do this almost “from scratch,” without using the general theory of field 
extensions. However, we will need to use some elementary facts about finite 
groups and quotient rings. Readers who have seen this material before may 
wish to proceed directly to §2. More complete treatments of this classical 
subject can be found in many texts on abstract algebra or Galois theory. 

The most basic examples of finite fields are the prime fields $\mathbb{F}_p = \mathbb{Z}/\langle p \rangle$,
where p is any prime number, but there are other examples as well. To
construct them, we will need to use the following elementary fact.

Exercise 1. Let k be any field, and let $g \in k[x]$ be an irreducible polynomial
(that is, a non-constant polynomial which is not the product of
two nonconstant polynomials in k[x]). Show that the ideal $\langle g \rangle \subset k[x]$ is a
maximal ideal. Deduce that $k[x]/\langle g \rangle$ is a field if g is irreducible.

For example, let p = 3 and consider the polynomial $g = x^2 + x + 2 \in \mathbb{F}_3[x]$.
Since g is a quadratic polynomial with no roots in $\mathbb{F}_3$, g is irreducible
in $\mathbb{F}_3[x]$. By Exercise 1, the ideal $\langle g \rangle$ is maximal, hence $\mathbb{F} = \mathbb{F}_3[x]/\langle g \rangle$ is a
field. As we discussed in Chapter 2, the elements of a quotient ring such as
$\mathbb{F}$ are in one-to-one correspondence with the possible remainders on division
by g. Hence the elements of $\mathbb{F}$ are the cosets of the polynomials ax + b,
where a, b are arbitrary in $\mathbb{F}_3$. As a result, $\mathbb{F}$ is a field of $3^2 = 9$ elements.

To distinguish more clearly between polynomials and the elements of our
field, we will write $\alpha$ for the element of $\mathbb{F}$ represented by the polynomial x.
Thus every element of $\mathbb{F}$ has the form $a\alpha + b$ for $a, b \in \mathbb{F}_3$. Also, note that
$\alpha$ satisfies the equation $g(\alpha) = \alpha^2 + \alpha + 2 = 0$.

The addition operation in $\mathbb{F}$ is the obvious one: $(a\alpha + b) + (a'\alpha + b') =
(a + a')\alpha + (b + b')$. As in Chapter 2 §2, we can compute products in $\mathbb{F}$ by
multiplication of polynomials in $\alpha$, subject to the relation $g(\alpha) = 0$. For
instance, you should verify that in $\mathbb{F}$

$$(\alpha + 1) \cdot (2\alpha + 1) = 2\alpha^2 + 1 = \alpha$$

(recall that the coefficients of these polynomials are elements of the field
$\mathbb{F}_3$, so that 1 + 2 = 0). Using this method, we may compute all the powers
of $\alpha$ in $\mathbb{F}$, and we find

$$\alpha^2 = 2\alpha + 1 \qquad \alpha^3 = 2\alpha + 2$$
$$\alpha^4 = 2 \qquad \alpha^5 = 2\alpha$$
$$\alpha^6 = \alpha + 2 \qquad \alpha^7 = \alpha + 1,$$

and $\alpha^8 = 1$. For future reference, we note that this computation also shows
that the multiplicative group of nonzero elements of $\mathbb{F}$ is a cyclic group of
order 8, generated by $\alpha$.
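The table of powers can be reproduced mechanically. The sketch below is an illustration, not part of the original text: it represents the element $a\alpha + b$ of $\mathbb{F}_3[x]/\langle x^2 + x + 2 \rangle$ as the pair (a, b), an encoding chosen here for convenience, and multiplies using the reduction $\alpha^2 = -\alpha - 2 = 2\alpha + 1$.

```python
P = 3  # characteristic of the prime field F_3

def mul(u, v):
    """Multiply a*alpha + b by c*alpha + d, reducing with alpha^2 = 2*alpha + 1."""
    a, b = u
    c, d = v
    # (a*alpha + b)(c*alpha + d) = ac*alpha^2 + (ad + bc)*alpha + bd
    return ((2 * a * c + a * d + b * c) % P, (a * c + b * d) % P)

def powers_of_alpha(count=8):
    """Return [alpha^1, ..., alpha^count] as (a, b) pairs."""
    alpha = (1, 0)
    current = (0, 1)  # the element 1
    result = []
    for _ in range(count):
        current = mul(current, alpha)
        result.append(current)
    return result

powers = powers_of_alpha()
print(powers)
```

The eight powers are pairwise distinct and end at (0, 1), the element 1, which is the statement that $\alpha$ has multiplicative order 8.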

The construction of $\mathbb{F}$ in this example may be generalized in the following
way. Consider the polynomial ring $\mathbb{F}_p[x]$, and let $g \in \mathbb{F}_p[x]$ be an
irreducible polynomial of degree n. The ideal $\langle g \rangle$ is maximal by Exercise
1, so the quotient ring $\mathbb{F} = \mathbb{F}_p[x]/\langle g \rangle$ is a field. The elements of $\mathbb{F}$ may be
represented by the cosets modulo $\langle g \rangle$ of the polynomials of degree n − 1 or
less: $a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, $a_i \in \mathbb{F}_p$. Since the $a_i$ are arbitrary, this
implies that $\mathbb{F}$ contains $p^n$ distinct elements.

Exercise 2.

a. Show that $g = x^4 + x + 1$ is irreducible in $\mathbb{F}_2[x]$. How many elements
are there in the field $\mathbb{F} = \mathbb{F}_2[x]/\langle g \rangle$?

b. Writing $\alpha$ for the element of $\mathbb{F}$ represented by x as above, compute all
the distinct powers of $\alpha$.

c. Show that $\mathbb{K} = \{0, 1, \alpha^5, \alpha^{10}\}$ is a field with four elements contained in
$\mathbb{F}$.

d. Is there a field with exactly eight elements contained in $\mathbb{F}$? Are there
any other subfields? (For the general pattern, see Exercise 10 below.)

In general we may ask what the possible sizes (numbers of elements) of
finite fields are. The following proposition gives a necessary condition.

(1.2) Proposition. Let $\mathbb{F}$ be a finite field. Then $|\mathbb{F}| = p^n$ where p is some
prime number and $n \ge 1$.

Proof. Since $\mathbb{F}$ is a field, it contains a multiplicative identity, which we
will denote by 1 as usual. Since $\mathbb{F}$ is finite, 1 must have finite additive order:
say p is the smallest positive integer such that $p \cdot 1 = 1 + \cdots + 1 = 0$
(p summands). The integer p must be prime. (Otherwise, if p = mn with
m, n > 1, then we would have $p \cdot 1 = (m \cdot 1)(n \cdot 1) = 0$ in $\mathbb{F}$. But since $\mathbb{F}$ is
a field, this would imply $m \cdot 1 = 0$ or $n \cdot 1 = 0$, which is not possible by the
minimality of p.) We leave it to the reader to check that the set of elements
of the form $m \cdot 1$, m = 0, 1, . . . , p − 1 in $\mathbb{F}$ is a subfield $\mathbb{K}$ isomorphic to
$\mathbb{F}_p$. See Exercise 9 below.

The axioms for fields imply that if we consider the addition operation
on $\mathbb{F}$ together with scalar multiplication of elements of $\mathbb{F}$ by elements from
$\mathbb{K} \subset \mathbb{F}$, then $\mathbb{F}$ has the structure of a vector space over $\mathbb{K}$. Since $\mathbb{F}$ is a finite
set, it must be finite-dimensional as a vector space over $\mathbb{K}$. Let n be its
dimension (the number of elements in any basis), and let $\{a_1, \ldots, a_n\} \subset \mathbb{F}$
be any basis. Every element of $\mathbb{F}$ can be expressed in exactly one way as a
linear combination $c_1 a_1 + \cdots + c_n a_n$, where $c_1, \ldots, c_n \in \mathbb{K}$. There are $p^n$
such linear combinations, which concludes the proof. □



To construct finite fields, we will always consider quotient rings $\mathbb{F}_p[x]/\langle g \rangle$
where g is an irreducible polynomial in $\mathbb{F}_p[x]$. There is no loss of generality
in doing this: every finite field can be obtained this way. See Exercise 11
below.

We will show next that for each prime p and each $n \ge 1$, there exist
finite fields of every size $p^n$ by counting the irreducible polynomials of fixed
degree in $\mathbb{F}_p[x]$. First note that it is enough to consider monic polynomials,
since we can always multiply by a constant in $\mathbb{F}_p$ to make the leading
coefficient of a polynomial equal 1. There are exactly $p^n$ distinct monic
polynomials $x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ of degree n in $\mathbb{F}_p[x]$. Consider
the generating function for this enumeration by degree: the power series in
u in which the coefficient of $u^n$ equals the number of monic polynomials
of degree n, namely $p^n$. This is the left hand side in (1.3) below. We treat
this as a purely formal series and disregard questions of convergence. The
formal geometric series summation formula yields

(1.3) $$\sum_{n=0}^{\infty} p^n u^n = \frac{1}{1 - pu}.$$



Each monic polynomial factors uniquely in $\mathbb{F}_p[x]$ into a product of monic
irreducibles. For each n, let $N_n$ be the number of monic irreducibles of
degree n in $\mathbb{F}_p[x]$. In factorizations of the form $g = g_1 \cdot g_2 \cdots g_m$ where the
$g_i$ are irreducible (but not necessarily distinct) of degrees $n_i$, we have $N_{n_i}$
choices for the factor $g_i$ for each i. The total degree of g is the sum of the
degrees of the factors.

Exercise 3. By counting factorizations as above, show that the number
of monic polynomials of degree n (i.e. $p^n$) can also be expressed as the
coefficient of $u^n$ in the formal infinite product

$$(1 + u + u^2 + \cdots)^{N_1} \cdot (1 + u^2 + u^4 + \cdots)^{N_2} \cdots = \prod_{k=1}^{\infty} \frac{1}{(1 - u^k)^{N_k}},$$

where the equality between the left- and right-hand sides comes from the
formal geometric series summation formula. Hint: The term in the product
with index k accounts for factors of degree k in polynomials.

Hence combining (1.3) with the result of Exercise 3, we obtain the
generating function identity

(1.4) $$\prod_{k=1}^{\infty} \frac{1}{(1 - u^k)^{N_k}} = \frac{1}{1 - pu}.$$

(1.5) Proposition. We have $p^n = \sum_{k \mid n} k N_k$, where the sum extends
over all positive divisors k of the integer n.

Proof. Formally taking logarithmic derivatives and multiplying the results
by u, from (1.4) we arrive at the identity
$\sum_{k=1}^{\infty} k N_k u^k/(1 - u^k) = pu/(1 - pu)$. Using formal geometric series again,
this equality can be rewritten as

$$\sum_{k=1}^{\infty} k N_k (u^k + u^{2k} + \cdots) = pu + p^2 u^2 + \cdots.$$

The proposition follows by comparing the coefficients of $u^n$ on both sides
of this last equation. □
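Proposition (1.5) determines the numbers $N_n$ recursively, and for small cases the result can be compared against a direct enumeration. The sketch below is illustrative only (the function names and the tuple encoding of polynomials, lowest-degree coefficient first, are choices made here): it solves $p^n = \sum_{k \mid n} k N_k$ for $N_n$, and separately counts monic irreducibles by removing every product of two monic polynomials of lower positive degree.

```python
from itertools import product

def count_from_identity(p, nmax):
    """Solve p^n = sum_{k | n} k * N_k for N_1, ..., N_nmax."""
    N = {}
    for n in range(1, nmax + 1):
        lower = sum(k * N[k] for k in range(1, n) if n % k == 0)
        N[n] = (p ** n - lower) // n
    return N

def polymul(f, g, p):
    """Multiply coefficient tuples (lowest degree first) over F_p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return tuple(out)

def monic(p, n):
    """All monic polynomials of degree n over F_p."""
    return [coeffs + (1,) for coeffs in product(range(p), repeat=n)]

def count_by_brute_force(p, n):
    """Number of monic irreducibles of degree n, by listing all reducible ones."""
    reducible = set()
    for d in range(1, n // 2 + 1):
        for f in monic(p, d):
            for g in monic(p, n - d):
                reducible.add(polymul(f, g, p))
    return p ** n - len(reducible)

for p in (2, 3):
    N = count_from_identity(p, 4)
    print(p, [N[n] for n in range(1, 5)],
          [count_by_brute_force(p, n) for n in range(1, 5)])
```

The brute-force count is exponential in n and is meant only as a sanity check on small cases; the recursion is the practical route.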

Exercise 4. (For readers with some background in elementary number 
theory.) Use Proposition (1.5) and the Mobius inversion formula to derive 
a general formula for iV^. 

We will show that > 0 for all n > 1. For n = 1, we have Ni = p 
since all x — (3 G Fp are irreducible. Then Proposition (1.5) implies that 

^2 = {p^ - p)/2 > 0, iVa = {p^ - p)/3 > 0, and N 4 = (p^ - p^)/4 > 0. 

Arguing by contradiction, suppose that $N_n = 0$ for some $n$. We may
assume $n \ge 5$ by the above. Then from Proposition (1.5),

(1.6)    $p^n = \sum_{k \mid n,\ 0 < k < n} k N_k.$

We can estimate the size of the right-hand side and derive a contradiction
from (1.6) as follows. We write $\lfloor A \rfloor$ for the greatest integer less than or
equal to $A$. Since $N_k \le p^k$ for all $k$ (the irreducibles are a subset of the
whole collection of monic polynomials), and any positive proper divisor $k$ of
$n$ is at most $\lfloor n/2 \rfloor$, we have

    $p^n \le \lfloor n/2 \rfloor \sum_{k=0}^{\lfloor n/2 \rfloor} p^k.$

Applying the finite geometric sum formula, the right-hand side equals

    $\lfloor n/2 \rfloor (p^{\lfloor n/2 \rfloor + 1} - 1)/(p - 1) < \lfloor n/2 \rfloor p^{\lfloor n/2 \rfloor + 1}.$

Hence

    $p^n < \lfloor n/2 \rfloor p^{\lfloor n/2 \rfloor + 1}.$

Dividing each side by $p^{\lfloor n/2 \rfloor}$, we obtain

    $p^{n - \lfloor n/2 \rfloor} < \lfloor n/2 \rfloor p.$

But this is clearly false for all $p$ and all $n \ge 5$. Hence $N_n > 0$ for all $n$, and
as a result we have the following fact.

(1.7) Theorem. For all primes $p$ and all $n \ge 1$, there exist finite fields
$F$ with $|F| = p^n$.

From the examples we have seen and from the proof of Theorem (1.7), 
it might appear that there are several different finite fields of a given size, 
since there will usually be more than one irreducible polynomial $g$ of a given
degree in $\mathbb{F}_p[x]$ to use in constructing quotients $\mathbb{F}_p[x]/\langle g \rangle$. But consider the
following example. 

Exercise 5. By Proposition (1.5), there are $(2^3 - 2)/3 = 2$ monic
irreducible polynomials of degree 3 in $\mathbb{F}_2[x]$, namely $g_1 = x^3 + x + 1$, and
$g_2 = x^3 + x^2 + 1$. Hence $K_1 = \mathbb{F}_2[x]/\langle g_1 \rangle$ and $K_2 = \mathbb{F}_2[x]/\langle g_2 \rangle$ are two finite
fields with 8 elements. We claim, however, that these fields are isomorphic.

a. Writing $\alpha$ for the coset of $x$ in $K_1$ (so $g_1(\alpha) = 0$ in $K_1$), show that
   $g_2(\alpha + 1) = 0$ in $K_1$.

b. Use this observation to derive an isomorphism between $K_1$ and $K_2$ (that
   is, a one-to-one, onto mapping that preserves sums and products).

The general pattern is the same. 

(1.8) Theorem. Let $K_1$ and $K_2$ be two fields with $|K_1| = |K_2| = p^n$.
Then $K_1$ and $K_2$ are isomorphic.

See Exercise 12 below for one way to prove this. Because of (1.8), it
makes sense to adopt the uniform notation $\mathbb{F}_{p^n}$ for any field of order $p^n$,
and we will use this convention for the remainder of the chapter. When
we do computations in $\mathbb{F}_{p^n}$, however, we will always use an explicit monic
irreducible polynomial $g(x)$ of degree $n$ as in the examples above.

The next general fact we will consider is also visible in (1.1) and in the
other examples we have encountered.

(1.9) Theorem. Let $F = \mathbb{F}_{p^n}$ be a finite field. The multiplicative group of
nonzero elements of $F$ is cyclic of order $p^n - 1$.

Proof. The statement about the order of the multiplicative group is clear
since we are omitting the single element 0. Write $m = p^n - 1$. By Lagrange’s
Theorem for finite groups ([Her]), every element $\beta \in F \setminus \{0\}$ is a root of
the polynomial equation $x^m = 1$, and the multiplicative order of each is
a divisor of $m$. We must show there is some element of order exactly $m$
to conclude the proof. Consider the prime factorization of $m$, say
$m = q_1^{e_1} \cdots q_k^{e_k}$. Let $m_i = m/q_i$. Since the polynomial $x^{m_i} - 1$ has at most $m_i$
roots in the field $F$, and $m_i < m$, there is some $\beta_i \in F \setminus \{0\}$ such that
$\beta_i^{m_i} \ne 1$. In Exercise 6 below, you will show that $\gamma_i = \beta_i^{m/q_i^{e_i}}$ has
multiplicative order exactly $q_i^{e_i}$ in $F$. It follows that the product
$\gamma_1 \gamma_2 \cdots \gamma_k$ has order $m$, since the $q_i^{e_i}$ are relatively prime. □

Exercise 6. In this exercise you will supply details for the final two claims 
in the proof of Theorem (1.9). 

a. Using the notation from the proof, show that $\gamma_i = \beta_i^{m/q_i^{e_i}}$ has
   multiplicative order exactly $q_i^{e_i}$ in $F$. (That is, show that $\gamma_i^{q_i^{e_i}} = 1$, but that
   $\gamma_i^k \ne 1$ for all $k = 1, \ldots, q_i^{e_i} - 1$.)

b. Let $\gamma_1, \gamma_2$ be elements of a finite abelian group. Suppose that the orders
   of $\gamma_1$ and $\gamma_2$ ($n_1$ and $n_2$ respectively) are relatively prime. Show that
   the order of the product $\gamma_1 \gamma_2$ is equal to $n_1 n_2$.

A generator for the multiplicative group of $\mathbb{F}_{p^n}$ is called a primitive
element. In the fields studied in (1.1) and in Exercise 2, the polynomials $g$
were chosen so that their roots were primitive elements of the corresponding
finite field. This will not be true for all choices of irreducible $g$ of a
given degree $n$ in $\mathbb{F}_p[x]$.

Exercise 7. For instance, consider the polynomial $g = x^2 + 1$ in $\mathbb{F}_3[x]$.
Check that $g$ is irreducible, so that $K = \mathbb{F}_3[x]/\langle g \rangle$ is a field with 9 elements.
However the coset of $x$ in $K$ is not a primitive element. (Why not? What is
its multiplicative order?)
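
Whether the coset of $x$ is primitive can be tested by brute force: multiply by $x$ repeatedly in $\mathbb{F}_p[x]/\langle g \rangle$ until the result is 1. The sketch below is one possible way to do this in Python. It assumes that $g$ is monic and irreducible of degree at least 2, stores polynomials as coefficient lists with the constant term first, and uses helper names (`poly_mulmod`, `order_of_x`) that are ours. The two sample polynomials over $\mathbb{F}_2$ are also our choice.

```python
def poly_mulmod(a, b, g, p):
    """Product a*b in F_p[x]/<g>. Coefficient lists, constant term first; g is monic."""
    n = len(g) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Cancel the terms of degree >= n, highest first, by subtracting multiples of g.
    for d in range(len(prod) - 1, n - 1, -1):
        c = prod[d]
        if c:
            for i in range(n + 1):
                prod[d - n + i] = (prod[d - n + i] - c * g[i]) % p
    result = prod[:n]
    return result + [0] * (n - len(result))


def order_of_x(g, p):
    """Multiplicative order of the coset of x in F_p[x]/<g> (g irreducible, degree >= 2)."""
    n = len(g) - 1
    one = [1] + [0] * (n - 1)
    x = [0, 1] + [0] * (n - 2)
    power, k = x, 1
    while power != one:
        power = poly_mulmod(power, x, g, p)
        k += 1
    return k


# x^4 + x + 1 and x^4 + x^3 + x^2 + x + 1 are both irreducible over F_2.
print(order_of_x([1, 1, 0, 0, 1], 2))  # 15, so x is primitive in this model of F_16
print(order_of_x([1, 1, 1, 1, 1], 2))  # 5, so x is not primitive here
```

The second polynomial divides $x^5 - 1$, which is why its root has order 5 rather than 15.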

For future reference, we also include the following fact about finite fields. 

Exercise 8. Suppose that $\beta \in \mathbb{F}_{p^n}$ is neither 0 nor 1. Then show that
$\sum_{j=0}^{p^n - 2} \beta^j = 0$. Hint: What is $(x^{p^n - 1} - 1)/(x - 1)$?

To conclude this section, we indicate one direct method for performing 
finite field arithmetic in Maple. Maple provides a built-in facility (via the 
mod operator) by which polynomial division, row operations on matrices,
resultant computations, etc. can be done using coefficients in finite fields.
When we construct a quotient ring $\mathbb{F}_p[x]/\langle g \rangle$, the coset of $x$ becomes a root
of the equation $g = 0$ in the quotient. In Maple, the elements of a finite field
can be represented as (cosets of) polynomials in RootOf expressions (see
Chapter 2, §1). For example, to declare the field $\mathbb{F}_8 = \mathbb{F}_2[x]/\langle x^3 + x + 1 \rangle$,
we could use

    alias(alpha = RootOf(x^3 + x + 1));

Then polynomials in alpha represent elements of the field $\mathbb{F}_8$ as before.
Arithmetic in the finite field can be performed as follows. For instance,
suppose we want to compute $b^3 + b$, where $b = \alpha + 1$. Entering

    b := alpha + 1;
    Normal(b^3 + b) mod 2;

yields

    alpha^2 + alpha + 1.

The Normal function computes the normal form for the element of the
finite field by expanding out $b^3 + b$ as a polynomial in $\alpha$, then finding the
remainder on division by $\alpha^3 + \alpha + 1$, using coefficients in $\mathbb{F}_2$.
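
For comparison, the same normal form computation can be sketched outside Maple. In the Python fragment below, which is an illustration only, an element of $\mathbb{F}_8$ is stored as an integer whose bit $i$ is the coefficient of $\alpha^i$, addition is XOR, and products are reduced by $\alpha^3 + \alpha + 1$. The bitmask encoding and the name `gf2_mulmod` are our own choices.

```python
def gf2_mulmod(a, b, g):
    """Product a*b in F_2[x]/<g>. Bit i of each integer is the coefficient of x^i;
    a must already be reduced (degree less than deg g)."""
    deg = g.bit_length() - 1
    result = 0
    while b:
        if b & 1:            # current bit of b is set: add the current multiple of a
            result ^= a
        b >>= 1
        a <<= 1              # multiply a by x ...
        if (a >> deg) & 1:   # ... and reduce if the degree reached deg g
            a ^= g
    return result


G_POLY = 0b1011              # x^3 + x + 1
b = 0b011                    # alpha + 1
b_cubed = gf2_mulmod(gf2_mulmod(b, b, G_POLY), b, G_POLY)
value = b_cubed ^ b          # b^3 + b, addition in characteristic 2 is XOR
print(format(value, "03b"))  # 111, i.e. alpha^2 + alpha + 1
```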

A technical point: You may have noticed that the Normal function name 
here is capitalized. There is also an uncapitalized normal function in Maple 
which can be used for algebraic simplification of expressions. We do not 
want that function here, however, because we want the function call to be 
passed unevaluated to mod, and all the arithmetic to be performed within 
the mod environment. Maple uses capitalized names consistently for
unevaluated function calls in this situation. Using the command normal(b^3
+ b) mod 2 would instruct Maple to simplify $b^3 + b$, then reduce mod 2,
which does not yield the correct result in this case. Try it!



Additional Exercises for §1 

Exercise 9. Verify the claim made in the proof of Proposition (1.2) that
if $F$ is a field with $p^n$ elements, then $F$ has a subfield

    $K = \{0, 1, 2 \cdot 1, \ldots, (p - 1) \cdot 1\}$

isomorphic to $\mathbb{F}_p$.

Exercise 10. Using Theorem (1.9), show that $\mathbb{F}_{p^n}$ contains a subfield $\mathbb{F}_{p^m}$
if and only if $m$ is a divisor of $n$. Hint: By (1.9), the multiplicative group
of the subfield is a subgroup of the multiplicative cyclic group $\mathbb{F}_{p^n} \setminus \{0\}$.
What are the orders of subgroups of a cyclic group of order $p^n - 1$?

Exercise 11. In this exercise, you will show that every finite field $F$ can
be obtained (up to isomorphism) as a quotient $F \cong \mathbb{F}_p[x]/\langle g \rangle$ for some
irreducible $g \in \mathbb{F}_p[x]$. For this exercise, we will need the fundamental theorem
of ring homomorphisms (see e.g., [CLO] Chapter 5, §2, Exercise 16). Let $F$
be a finite field, and say $|F| = p^n$ (see Proposition (1.2)). Let $\alpha$ be a
primitive element for $F$ (see (1.9)). Consider the ring homomorphism defined
by

    $\varphi : \mathbb{F}_p[x] \to F$
    $x \mapsto \alpha.$

a. Explain why $\varphi$ must be onto.

b. Deduce that the kernel of $\varphi$ must have the form $\ker(\varphi) = \langle g \rangle$ for some
   irreducible monic polynomial $g \in \mathbb{F}_p[x]$. (The monic generator is called
   the minimal polynomial of $\alpha$ over $\mathbb{F}_p$.)

c. Apply the fundamental theorem to show that

    $F \cong \mathbb{F}_p[x]/\langle g \rangle.$

Exercise 12. In this exercise, you will develop one proof of Theorem (1.8),
using Theorem (1.9) and the previous exercise. Let $K$ and $L$ be two fields
with $p^n$ elements. Let $\beta$ be a primitive element for $L$, and let $g \in \mathbb{F}_p[x]$ be
the minimal polynomial of $\beta$ over $\mathbb{F}_p$, so that $L \cong \mathbb{F}_p[x]/\langle g \rangle$ (Exercise 11).

a. Show that $g$ must divide the polynomial $x^{p^n} - x$ in $\mathbb{F}_p[x]$. (Use (1.9).)

b. Show that $x^{p^n} - x$ splits completely into linear factors in $K[x]$:

    $x^{p^n} - x = \prod_{\alpha \in K} (x - \alpha).$

c. Deduce that there is some $\alpha \in K$ which is a root of $g = 0$.

d. From part c, deduce that $K$ is also isomorphic to $\mathbb{F}_p[x]/\langle g \rangle$. Hence,
   $K \cong L$.

Exercise 13. Find irreducible polynomials $g$ in the appropriate $\mathbb{F}_p[x]$,
such that $\mathbb{F}_p[x]/\langle g \rangle \cong \mathbb{F}_{p^n}$, and such that $\alpha = [x]$ is a primitive element in
$\mathbb{F}_{p^n}$ for each $p^n \le 64$. (Note: The cases $p^n = 8, 9, 16$ are considered in the
text. Extensive tables of such polynomials have been constructed for use
in coding theory. See for instance [PH].)

Exercise 14. (The Frobenius Automorphism) Let $\mathbb{F}_q$ be a finite field. By
Exercise 10, $\mathbb{F}_q \subset \mathbb{F}_{q^m}$ for each $m \ge 1$. Consider the mapping
$F : \mathbb{F}_{q^m} \to \mathbb{F}_{q^m}$ defined by $F(x) = x^q$.

a. Show that $F$ is one-to-one and onto, and that $F(x + y) = F(x) + F(y)$
   and $F(xy) = F(x)F(y)$ for all $x, y \in \mathbb{F}_{q^m}$. (In other words, $F$ is an
   automorphism of the field $\mathbb{F}_{q^m}$.)

b. Show that $F(x) = x$ if and only if $x \in \mathbb{F}_q \subset \mathbb{F}_{q^m}$.

For readers familiar with Galois theory, we mention that the Frobenius
automorphism $F$ generates the Galois group of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$, a cyclic group
of order $m$.



§2 Error-Correcting Codes 

In this section, we will introduce some of the basic standard notions from 
algebraic coding theory. For more complete treatments of the subject, we 
refer the reader to [vLi], [Bla], or [MS]. 

Communication of information often takes place over “noisy” channels 
which can introduce errors in the transmitted message. This is the case 
for instance in satellite transmissions, in the transfer of information within 
computer systems, and in the process of storing information (numeric data, 
music, images, etc.) on tape, on compact disks or other media, and retriev- 
ing it for use at a later time. In these situations, it is desirable to encode 
the information in such a way that errors can be detected and/or corrected 
when they occur. The design of coding schemes, together with efficient 
techniques for encoding and decoding (i.e. recovering the original message 
from its encoded form) is one of the main goals of coding theory. 

In some situations, it might also be desirable to encode information in 
such a way that unauthorized readers of the received message will not be 
able to decode it. The construction of codes for secrecy is the domain of 
cryptography, a related but distinct field that will not be considered here. 
Interestingly enough, ideas from number theory and algebraic geometry 
have assumed a major role there as well. The forthcoming book [Kob] 
includes some applications of computational algebraic geometry in modern 
cryptography. 

In this chapter, we will study one specific type of code, in which all in- 
formation to be encoded consists of strings or words of some fixed length k 
using symbols from a fixed alphabet, and all encoded messages are divided 
into strings called codewords of a fixed block length n, using symbols from 
the same alphabet. In order to detect and/or correct errors, some redun- 
dancy must be introduced in the encoding process, hence we will always 
have n > k. 

Because of the design of most electronic circuitry, it is natural to consider 
a binary alphabet consisting of the two symbols {0, 1}, and to identify the 
alphabet with the finite field $\mathbb{F}_2$. As in §1, strings of $r$ bits (thought of as
the coefficients in a polynomial of degree $r - 1$) can also represent elements
of a field $\mathbb{F}_{2^r}$, and it will be advantageous in some cases to think of $\mathbb{F}_{2^r}$
as the alphabet. But the constructions we will present are valid with an
arbitrary finite field $\mathbb{F}_q$ as the alphabet.

In mathematical terms, the encoding process for a string from the message
will be a one-to-one function $E : \mathbb{F}_q^k \to \mathbb{F}_q^n$. The image
$C = E(\mathbb{F}_q^k) \subset \mathbb{F}_q^n$ is referred to as the set of codewords, or more succinctly
as the code. Mathematically, the decoding operation can be viewed as a
function $D : \mathbb{F}_q^n \to \mathbb{F}_q^k$ such that $D \circ E$ is the identity on $\mathbb{F}_q^k$. (This is
actually an over-simplification; most real-world decoding functions will also
return something like an “error” value in certain situations.)

In principle, the set of codewords can be an arbitrary subset of $\mathbb{F}_q^n$.
However, we will almost always restrict our attention to a class of codes
with additional structure that is very convenient for encoding and decoding.
This is the class of linear codes. By definition, a linear code is one where
the set of codewords $C$ forms a vector subspace of $\mathbb{F}_q^n$ of dimension $k$.
In this case, as encoding function $E : \mathbb{F}_q^k \to \mathbb{F}_q^n$ we may use a linear
mapping whose image is the subspace $C$. The matrix of $E$ with respect to
the standard bases in the domain and target is often called the generator
matrix $G$ corresponding to $E$.

It is customary in coding theory to write generator matrices for linear
codes as $k \times n$ matrices and to view the strings in $\mathbb{F}_q^k$ as row vectors $w$.
Then the encoding operation can be viewed as matrix multiplication of a
row vector on the right by the generator matrix, and the rows of $G$ form
a basis for $C$. As always in linear algebra, the subspace $C$ may also be
described as the set of solutions of a system of $n - k$ independent linear
equations in $n$ variables. The matrix of coefficients of such a system is
often called a parity check matrix for $C$. This name comes from the fact
that one simple error-detection scheme for binary codes is to require that
all codewords have an even (or odd) number of nonzero digits. If one bit
error (in fact, any odd number of errors) is introduced in transmission, that
fact can be recognized by multiplication of the received word by the parity
check matrix $H = (1 \ 1 \ \cdots \ 1)^T$. The parity check matrix for a linear
code can be seen as an extension of this idea, in which more sophisticated
tests for the validity of the received word are performed by multiplication
by the parity check matrix.
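
As a small illustration of these conventions, the following Python sketch encodes row vectors by multiplying on the right by a generator matrix and tests received words with the single parity check matrix $(1 \ 1 \ 1)^T$. The particular code (messages of length 2 with one appended parity bit) and the helper `vec_mat` are our own choices, and reducing modulo $q$ is valid only when $q$ is prime.

```python
from itertools import product


def vec_mat(v, M, q=2):
    """Row vector v times matrix M over F_q, for a prime q (arithmetic mod q)."""
    return [sum(vi * row[j] for vi, row in zip(v, M)) % q for j in range(len(M[0]))]


# Even-weight code with k = 2, n = 3: the third symbol is the sum of the first two.
G = [[1, 0, 1],
     [0, 1, 1]]
H = [[1], [1], [1]]            # the column (1 1 1)^T

codewords = [vec_mat(w, G) for w in product((0, 1), repeat=2)]
syndromes = [vec_mat(c, H) for c in codewords]       # all equal to [0]

corrupted = [1, 1, 1]          # the codeword (1, 0, 1) with its middle bit flipped
corrupted_syndrome = vec_mat(corrupted, H)           # [1]: the error is detected
print(codewords, syndromes, corrupted_syndrome)
```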

Exercise 1. Consider the linear code $C$ with $n = 4$, $k = 2$ given by the
generator matrix

    $G = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}.$

a. Show that since we have only the two scalars $0, 1 \in \mathbb{F}_2$ to use in making
   linear combinations, there are exactly four elements of $C$:

    $(0,0)G = (0,0,0,0)$,   $(1,0)G = (1,1,1,1)$,
    $(0,1)G = (1,0,1,0)$,   $(1,1)G = (0,1,0,1)$.

b. Show that

    $H$  [a $4 \times 2$ matrix over $\mathbb{F}_2$; its entries are missing here]

   is a parity check matrix for $C$ by verifying that $xH = 0$ for all $x \in C$.

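Exercise 2. Let $\mathbb{F}_4 = \mathbb{F}_2[\alpha]/\langle \alpha^2 + \alpha + 1 \rangle$, and consider the linear code $C$
in $\mathbb{F}_4^5$ with generator matrix

    $\begin{pmatrix} \alpha & 0 & \alpha + 1 & 1 & 0 \\ 1 & 1 & \alpha & 0 & 1 \end{pmatrix}.$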

How many distinct codewords are there in C? Find them. Also find a parity 
check matrix for C. Hint: Recall from linear algebra that there is a general 
procedure using matrix operations for finding a system of linear equations 
defining a given subspace. 

To study the error-correcting capability of codes, we need a measure
of how close elements of $\mathbb{F}_q^n$ are, and for this we will use the Hamming
distance. Let $x, y \in \mathbb{F}_q^n$. Then the Hamming distance between $x$ and $y$ is
defined to be

    $d(x, y) = |\{i,\ 1 \le i \le n : x_i \ne y_i\}|.$

For instance, if $x = (0, 0, 1, 1, 0)$, and $y = (1, 0, 1, 0, 0)$ in $\mathbb{F}_2^5$, then
$d(x, y) = 2$ since only the first and fourth bits in $x$ and $y$ differ.
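
The definition translates directly into a few lines of code. The following Python sketch (the function name is ours) counts the positions where two words of equal length differ and reproduces the example above.

```python
def hamming_distance(x, y):
    """Number of positions i where x[i] != y[i]; x and y must have the same length."""
    if len(x) != len(y):
        raise ValueError("words must have the same block length")
    return sum(1 for xi, yi in zip(x, y) if xi != yi)


x = (0, 0, 1, 1, 0)
y = (1, 0, 1, 0, 0)
print(hamming_distance(x, y))  # 2: the words differ in the first and fourth positions
```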

Exercise 3. Verify that the Hamming distance has all the properties of a
metric or distance function on $\mathbb{F}_q^n$. (That is, show $d(x, y) \ge 0$ for all $x, y$
and $d(x, y) = 0$ if and only if $x = y$, the symmetry property $d(x, y) =
d(y, x)$ holds for all $x, y$, and the triangle inequality $d(x, y) \le d(x, z) +
d(z, y)$ is valid for all $x, y, z$.)

Given $x \in \mathbb{F}_q^n$, we will denote by $B_r(x)$ the closed ball of radius $r$ (in
the Hamming distance) centered at $x$:

    $B_r(x) = \{y \in \mathbb{F}_q^n : d(y, x) \le r\}.$

(In other words, $B_r(x)$ is the set of $y$ differing from $x$ in at most $r$
components.)

The Hamming distance gives a simple but extremely useful way to measure
the error-correcting capability of a code. Namely, suppose that every
pair of distinct codewords $x, y$ in a code $C \subset \mathbb{F}_q^n$ satisfies $d(x, y) \ge d$ for
some $d \ge 1$. If no more than $d - 1$ of the components of a valid codeword
are altered, under this hypothesis the result is never another codeword. As
a result, it is possible to tell that errors have occurred. Any $d - 1$ or fewer
errors in a received word can be detected.

Moreover if $d \ge 2t + 1$ for some $t \ge 1$, then for any $z \in \mathbb{F}_q^n$, by
the triangle inequality, $d(x, z) + d(z, y) \ge d(x, y) \ge 2t + 1$. It follows
immediately that either $d(x, z) > t$ or $d(y, z) > t$, so $B_t(x) \cap B_t(y) = \emptyset$.
As a result the only codeword in $B_t(x)$ is $x$ itself. In other words, if any $t$
or fewer errors are introduced in transmission of a codeword, those errors
can be corrected by the nearest neighbor decoding function

    $D(x) = E^{-1}(c)$, where $c \in C$ minimizes $d(x, c)$

(and an “error” value if there is no unique closest element in $C$).
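
A brute-force version of nearest neighbor decoding can be sketched as follows. This is only an illustration under our own conventions: the code is given as a Python dictionary from messages to codewords, and `None` plays the role of the “error” value when the closest codeword is not unique. The two sample codes are the 3-fold repetition code ($d = 3$, so $t = 1$) and the four-word code listed in Exercise 1 ($d = 2$).

```python
def nearest_neighbor_decode(received, codebook):
    """Return the message whose codeword is closest to `received`, or None on a tie.

    codebook maps message tuples to codeword tuples (the encoding function E)."""
    best, best_dist, tie = None, None, False
    for message, codeword in codebook.items():
        dist = sum(a != b for a, b in zip(received, codeword))
        if best_dist is None or dist < best_dist:
            best, best_dist, tie = message, dist, False
        elif dist == best_dist:
            tie = True
    return None if tie else best


repetition = {(0,): (0, 0, 0), (1,): (1, 1, 1)}
print(nearest_neighbor_decode((1, 0, 1), repetition))    # (1,): one error corrected

exercise1 = {(0, 0): (0, 0, 0, 0), (1, 0): (1, 1, 1, 1),
             (0, 1): (1, 0, 1, 0), (1, 1): (0, 1, 0, 1)}
print(nearest_neighbor_decode((1, 0, 0, 0), exercise1))  # None: two codewords at distance 1
```

With $d = 2$ a single error is detected but cannot be corrected, which is what the tie in the second call reflects.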

From this discussion it is clear that the minimum distance

    $d = \min\{d(x, y) : x \ne y \in C\}$

is an important parameter of codes, and our observations above can be
summarized in the following way.

(2.1) Proposition. Let $C$ be a code with minimum distance $d$. Any $d - 1$
errors in a received word can be detected. Moreover, if $d \ge 2t + 1$, any $t$
errors can be corrected by nearest neighbor decoding.

Since the minimum distance of a code contains so much information, it 
is convenient that for linear codes we need only examine the codewords 
themselves to determine this parameter. 



Exercise 4. Show that for any linear code $C$ the minimum distance $d$ is
the same as $\min_{x \in C \setminus \{0\}} |\{i : x_i \ne 0\}|$ (the minimum number of nonzero
entries in a nonzero codeword). Hint: Since the set of codewords is closed
under vector sums, $x - y \in C$ whenever $x$ and $y$ are.



The Hamming codes form a famous family of examples with interesting
error-correcting capabilities. One code in the family is a code over $\mathbb{F}_2$ with
$n = 7$, $k = 4$. (The others are considered in Exercise 11 below.) For this
Hamming code, the generator matrix is

(2.2)    $G = (I_4 \mid P)$  [a $4 \times 7$ matrix over $\mathbb{F}_2$; the entries of the $4 \times 3$ block $P$ are missing here]

For example $w = (1, 1, 0, 1) \in \mathbb{F}_2^4$ is encoded by multiplication on the right
by $G$, yielding $E(w) = wG = (1, 1, 0, 1, 0, 0, 1)$. From the form of the first
four columns of $G$, the first four components of $E(w)$ will always consist of
the four components of $w$ itself.

The reader should check that the $7 \times 3$ matrix

(2.3)    $H$  [its rows are the seven nonzero vectors of $\mathbb{F}_2^3$; the displayed entries are missing here]

has rank 3 and satisfies $GH = 0$. Hence $H$ is a parity check matrix for
the Hamming code (why?). It is easy to check directly that each of the
15 nonzero codewords of the Hamming code contains at least 3 nonzero
components. This implies that $d(x, y)$ is at least 3 when $x \ne y$. Hence the
minimum distance of the Hamming code is $d = 3$, since there are exactly
three nonzero entries in row 1 of $G$ for example. By Proposition (2.1), any
pair of errors in a received word can be detected, and any single error in a
received word can be corrected by nearest neighbor decoding. The following
exercise gives another interesting property of the Hamming code.

Exercise 5. Show that the balls of radius 1 centered at each of the words
of the Hamming code are pairwise disjoint, and cover $\mathbb{F}_2^7$ completely. (A
code $C$ with minimum distance $d = 2t + 1$ is called a perfect code if the
union of the balls of radius $t$ centered at the codewords equals $\mathbb{F}_q^n$.)

Generalizing a property of the generator matrix (2.2) noted above, encod- 
ing functions with the property that the symbols of the input word appear 
unchanged in some components of the codeword are known as systematic 
encoders. It is customary to call those components of the codewords the in- 
formation positions. The remaining components of the codewords are called 
parity checks. Systematic encoders are sometimes desirable from a practical 
point of view because the information positions can be copied directly from 
the word to be encoded; only the parity checks need to be computed. There 
are corresponding savings in the decoding operation as well. If information 
is systematically encoded and no errors occur in transmission, the words 
in the message can be obtained directly from the received words by simply 
removing the parity checks. (It is perhaps worthwhile to mention again at 
this point that the goal of the encoding schemes we are considering here is 
reliability of information transmission, not secrecy!) 

Exercise 6. Suppose that the generator matrix for a linear code $C$ has
the systematic form $G = (I_k \mid P)$, where $I_k$ is a $k \times k$ identity matrix, and
$P$ is some $k \times (n - k)$ matrix. Show that

    $H = \begin{pmatrix} -P \\ I_{n-k} \end{pmatrix}$

is a parity check matrix for $C$.

We will refer to a linear code with block length $n$, dimension $k$, and
minimum distance $d$ as an $[n, k, d]$ code. For instance, the Hamming code
given by the generator matrix (2.2) is a [7,4,3] code.
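
The checks described for the [7,4,3] Hamming code can be carried out mechanically. The Python sketch below assumes one particular systematic generator matrix $G = (I_4 \mid P)$, with the rows of $P$ equal to 011, 101, 110, 111. This choice is an assumption on our part: it agrees with the encoding $E(1,1,0,1) = (1,1,0,1,0,0,1)$ and with the three nonzero entries in row 1 quoted in the text, but other orderings of the rows of $P$ would satisfy the same conditions. The parity check matrix is taken to be $P$ stacked on $I_3$ (over $\mathbb{F}_2$, $-P = P$).

```python
from itertools import product

# Assumed block P of a systematic generator matrix G = (I_4 | P); see the lead-in.
P = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0],
     [1, 1, 1]]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


G = [identity(4)[i] + P[i] for i in range(4)]   # 4 x 7
H = P + identity(3)                             # 7 x 3, rows are the nonzero vectors of F_2^3


def vec_mat(v, M):
    """Row vector times matrix over F_2."""
    return tuple(sum(vi * row[j] for vi, row in zip(v, M)) % 2 for j in range(len(M[0])))


codewords = [vec_mat(w, G) for w in product((0, 1), repeat=4)]
min_distance = min(sum(c) for c in codewords if any(c))   # minimum weight, as in Exercise 4

print(vec_mat((1, 1, 0, 1), G))                           # (1, 1, 0, 1, 0, 0, 1)
print(all(vec_mat(c, H) == (0, 0, 0) for c in codewords)) # True: cH = 0 for every codeword
print(len(set(codewords)), min_distance)                  # 16 3
```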

Determining which triples of parameters $[n, k, d]$ can be realized by codes
over a given finite field and constructing such codes are two important
problems in coding theory. These questions are directly motivated by the
decisions an engineer would need to make in selecting a code for a given
application. Since an $[n, k, d]$ code has $q^k$ distinct codewords, the choice of
the parameter $k$ will be determined by the size of the collection of words
appearing in the messages to be transmitted. Based on the characteristics
of the channel over which the transmission takes place (in particular the
probability that an error occurs in transmission of a symbol), a value of
$d$ would be chosen to ensure that the probability of receiving a word that
could not be correctly decoded was acceptably small. The remaining question
would be how big to take $n$ to ensure that a code with the desired
parameters $k$ and $d$ actually exists. It is easy to see that, fixing $k$, we can
construct codes with $d$ as large as we like by taking $n$ very large. (For
instance, our codewords could consist of many concatenated copies of the
corresponding words in $\mathbb{F}_q^k$.) However, the resulting codes would usually be
too redundant to be practically useful. “Good” codes are ones for which
the information rate $R = k/n$ is not too small, but for which $d$ is relatively
large. There is a famous result known as Shannon’s Theorem (for the precise
statement see, e.g., [vLi]) that ensures the existence of “good” codes in
this sense, but the actual construction of “good” codes is one of the main
problems in coding theory.

Exercise 7. In the following exercises, we explore some theoretical results
giving various bounds on the parameters of codes. One way to try to produce
good codes is to fix a block-length $n$ and a minimum distance $d$, then
attempt to maximize $k$ by choosing the codewords one by one so as to keep
$d(x, y) \ge d$ for all distinct pairs $x \ne y$.

a. Show that $b = |B_{d-1}(c)|$ is given by $b = \sum_{i=0}^{d-1} \binom{n}{i} (q - 1)^i$ for each
   $c \in \mathbb{F}_q^n$.

b. Let $d$ be a given positive integer, and let $C$ be a subset $C \subset \mathbb{F}_q^n$ (not
   necessarily a linear code) such that $d(x, y) \ge d$ for all pairs $x \ne y$ in
   $C$. Assume that for all $z \in \mathbb{F}_q^n \setminus C$, $d(z, c) \le d - 1$ for some $c \in C$.
   Then show that $b \cdot |C| \ge q^n$ ($b$ as in part a). This result gives one form
   of the Gilbert-Varshamov bound. Hint: an equivalent statement is that
   if $b \cdot |C| < q^n$, then there exists some $z$ such that every pair of distinct
   elements in $C \cup \{z\}$ is still separated by at least $d$.

c. Show that if $k$ satisfies $b < q^{n-k+1}$, then an $[n, k, d]$ linear code exists.
   Hint: By induction, we may assume that an $[n, k - 1, d]$ linear code $C$
   exists. Using part b, consider the linear code $C'$ spanned by $C$ and $z$,
   where the distance from $z$ to any word in $C$ is $\ge d$. Show that $C'$ still
   has minimum distance $d$.

d. On the other hand, show that for any linear code $d \le n - k + 1$. This
   result is known as the Singleton bound. Hint: consider what happens
   when a subset of $d - 1$ components is deleted from each of the codewords.
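
The quantities appearing in these bounds are easy to evaluate numerically. The Python sketch below (function names are ours) computes the ball size $b$ from part a and tests the sufficient condition of part c and the Singleton bound of part d for given parameters; it does not prove any of the statements in the exercise.

```python
from math import comb


def ball_size(n, r, q):
    """Number of words within Hamming distance r of a fixed word in F_q^n."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))


def gv_guarantees_linear_code(n, k, d, q):
    """Sufficient condition of part c: b < q^(n-k+1), with b = |B_{d-1}(c)|."""
    return ball_size(n, d - 1, q) < q ** (n - k + 1)


def satisfies_singleton(n, k, d):
    """Necessary condition of part d: d <= n - k + 1."""
    return d <= n - k + 1


print(ball_size(7, 2, 2))                     # 1 + 7 + 21 = 29
print(gv_guarantees_linear_code(7, 3, 3, 2))  # True, since 29 < 2^5
print(gv_guarantees_linear_code(7, 4, 3, 2))  # False, since 29 >= 2^4
print(satisfies_singleton(7, 4, 3))           # True
```

The condition in part c is sufficient but not necessary: it fails for the parameters [7,4,3] over $\mathbb{F}_2$, even though the Hamming code realizes them.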

Many other theoretical results, including both upper and lower bounds 
on the n, k, d parameters of codes, are also known. See the coding theory 
texts mentioned at the start of this section. 

We now turn to the encoding and decoding operations. Our first observation
is that encoding is much simpler to perform for linear codes than
for arbitrary codes. For a completely arbitrary $C$ of size $q^k$ there would
be little alternative to using some form of table look-up to compute the
encoding function. On the other hand, for a linear code all the information
about the code necessary for encoding is contained in the generator matrix
(only $k$ basis vectors for $C$ rather than the whole set of $q^k$ codewords),
and all operations necessary for encoding may be performed using linear
algebra.

Decoding a linear code is also correspondingly simpler. A general method,
known as syndrome decoding, is based on the following observation. If $c =
wG$ is a codeword, and some errors $e \in \mathbb{F}_q^n$ are introduced on transmission
of $c$, the received word will be $x = c + e$. Then $cH = 0$ implies that
$xH = (c + e)H = cH + eH = 0 + eH = eH$. Hence $xH$ depends only on the
error. The possible values for $eH \in \mathbb{F}_q^{n-k}$ are known as syndromes, and it is
easy to see that these syndromes are in one-to-one correspondence with
the cosets of $C$ in $\mathbb{F}_q^n$ (or elements of the quotient space $\mathbb{F}_q^n/C \cong \mathbb{F}_q^{n-k}$),
so there are exactly $q^{n-k}$ of them. (See Exercise 12 below.)

Syndrome decoding works as follows. First, a preliminary calculation is
performed, before any decoding. We construct a table, indexed by the possible
values of the syndrome $s = xH$, of the element(s) in the corresponding
coset with the smallest number of nonzero entries. These special elements
of the cosets of $C$ are called the coset leaders.

Exercise 8. Say $d = 2t + 1$, so we know that any $t$ or fewer errors can be
corrected. Show that if there are any elements of a coset of $C$ which have
$t$ or fewer nonzero entries, then there is only one such element, and as a
result the coset leader is unique.

If $x \in \mathbb{F}_q^n$ is received, we first compute the syndrome $s = xH$ and
look up the coset leader(s) $\ell$ corresponding to $s$ in our table. If there is
a unique leader, we replace $x$ by $x' = x - \ell$, which is in $C$ (why?). (If
$s = 0$, then $\ell = 0$, and $x' = x$ is itself a codeword.) Otherwise, we report
an “error” value. By Exercise 8, if no more than $t$ errors occurred in $x$,
then we have found the unique codeword closest to the received word $x$
and we return $E^{-1}(x')$. Note that by this method we have accomplished
nearest-neighbor decoding without computing $d(x, c)$ for all $q^k$ codewords.
However, a potentially large collection of information must be maintained
to carry out this procedure: the table of coset leader(s) for each of the
$q^{n-k}$ cosets of $C$. In cases of practical interest, $n - k$ and $q$ can be large,
so $q^{n-k}$ can be huge.
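
The procedure can be sketched in a few lines of Python. To avoid anticipating the exercise that follows, the example uses a small binary $[5, 2, 3]$ code of our own choosing, with systematic generator matrix $G = (I_2 \mid P)$ and parity check matrix $P$ stacked on $I_3$; the helper names and the use of `None` as the “error” value are also ours.

```python
from itertools import product

P = [[1, 1, 0],
     [0, 1, 1]]
G = [[1, 0] + P[0],
     [0, 1] + P[1]]
H = P + [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # 5 x 3, and cH = 0 for every codeword c


def vec_mat(v, M):
    """Row vector times matrix over F_2."""
    return tuple(sum(vi * row[j] for vi, row in zip(v, M)) % 2 for j in range(len(M[0])))


def build_leader_table(H):
    """Map each syndrome to the list of minimum-weight words in its coset."""
    table = {}
    for e in product((0, 1), repeat=len(H)):
        s = vec_mat(e, H)
        leaders = table.setdefault(s, [])
        if not leaders or sum(e) < sum(leaders[0]):
            table[s] = [e]              # strictly lighter word: new unique leader
        elif sum(e) == sum(leaders[0]):
            leaders.append(e)           # same weight: the leader is not unique
    return table


def syndrome_decode(x, H, table, k):
    """Return the decoded message, or None if the coset leader is not unique."""
    leaders = table[vec_mat(x, H)]
    if len(leaders) != 1:
        return None
    corrected = tuple((xi - li) % 2 for xi, li in zip(x, leaders[0]))
    return corrected[:k]                # systematic code: the message is the first k symbols


table = build_leader_table(H)
print(len(table))                                     # 8 = 2^(5-2) syndromes
print(syndrome_decode((1, 1, 1, 1, 0), H, table, 2))  # (1, 0): one error corrected
print(syndrome_decode((1, 1, 0, 0, 0), H, table, 2))  # None: two leaders of weight 2
```

The table is built by running through all $q^n$ words, which is feasible only for very small codes; it is the size of the table, $q^{n-k}$ entries, that matters once it has been computed.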

Exercise 9. Compute the table of coset leaders for the [7,4,3] Hamming
code from (2.2). Use syndrome decoding to decode the received word
(1, 1, 0, 1, 1, 1, 0).
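
Here is another example of a linear code, this time over the field $\mathbb{F}_4 =
\mathbb{F}_2[\alpha]/\langle \alpha^2 + \alpha + 1 \rangle$. Consider the code $C$ with $n = 8$, $k = 3$ over $\mathbb{F}_4$ defined
by the generator matrix:

(2.4)    $G = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & \alpha & \alpha & \alpha^2 & \alpha^2 \\ 0 & 1 & \alpha & \alpha^2 & \alpha & \alpha^2 & \alpha & \alpha^2 \end{pmatrix}.$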

Note that $G$ does not have the systematic form we saw above for the
Hamming code’s generator matrix. Though this is not an impediment to
encoding, we can also obtain a systematic generator matrix for $C$ by
row-reduction (Gauss-Jordan elimination). This corresponds to changing basis
in $C$; the image of the encoding map $E : \mathbb{F}_4^3 \to \mathbb{F}_4^8$ is not changed. It is
a good exercise in finite field arithmetic to perform this computation by
hand. It can also be done in Maple as follows. For simplicity, we will write
a for $\alpha$ within Maple. To work in $\mathbb{F}_4$ we begin by defining a as a root of
the polynomial $x^2 + x + 1$ as above.

    alias(a = RootOf(x^2 + x + 1)):

The generator matrix $G$ is entered as

    m := array(1..3, 1..8, [[1, 1, 1, 1, 1, 1, 1, 1],
        [0, 0, 1, 1, a, a, a^2, a^2], [0, 1, a, a^2, a, a^2, a, a^2]]):

Then the command

    mr := Gaussjord(m) mod 2;

will perform Gauss-Jordan elimination with coefficients treated as elements
of $\mathbb{F}_4$. (Recall Maple’s capitalization convention for unevaluated function
calls, discussed in §1.) The result should be

    $\begin{pmatrix} 1 & 0 & 0 & 1 & a & a+1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 1 & 0 & a+1 & a \\ 0 & 0 & 1 & 1 & a & a & a+1 & a+1 \end{pmatrix}.$

Note that $a^2$ is replaced by its reduced form $a + 1$ everywhere here.
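
For readers who want to check the row reduction without Maple, here is a minimal Python sketch of Gauss-Jordan elimination over $\mathbb{F}_4$. The encoding is our own assumption, not part of the text: the elements $0, 1, a, a + 1$ are stored as the integers 0, 1, 2, 3, so that addition is bitwise XOR, and multiplication is read from a table built from the relation $a^2 = a + 1$.

```python
A, A1 = 2, 3                     # a and a + 1 (= a^2) in the integer encoding

MUL = [[0, 0, 0, 0],             # MUL[x][y] is the product x*y in F_4
       [0, 1, 2, 3],
       [0, 2, 3, 1],
       [0, 3, 1, 2]]
INV = {1: 1, 2: 3, 3: 2}         # multiplicative inverses: a * (a + 1) = 1


def gauss_jordan_f4(matrix):
    """Return the reduced row echelon form of a matrix with entries in F_4."""
    m = [row[:] for row in matrix]
    pivot_row = 0
    for col in range(len(m[0])):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col]), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        inv = INV[m[pivot_row][col]]
        m[pivot_row] = [MUL[inv][v] for v in m[pivot_row]]      # scale the pivot to 1
        for r in range(len(m)):
            if r != pivot_row and m[r][col]:
                f = m[r][col]
                # Subtraction equals addition (XOR) in characteristic 2.
                m[r] = [v ^ MUL[f][pv] for v, pv in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m


G = [[1, 1, 1, 1, 1, 1, 1, 1],
     [0, 0, 1, 1, A, A, A1, A1],
     [0, 1, A, A1, A, A1, A, A1]]

for row in gauss_jordan_f4(G):
    print(row)
```

In this encoding the three printed rows are [1, 0, 0, 1, 2, 3, 1, 0], [0, 1, 0, 1, 1, 0, 3, 2] and [0, 0, 1, 1, 2, 2, 3, 3], which is the reduced matrix displayed above.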

In the reduced matrix, the second row has five nonzero entries. Hence
the minimum distance $d$ for this code is $\le 5$. It is often quite difficult
to determine the exact minimum distance of a code (especially when the
number of nonzero codewords, $q^k - 1$, is large). In §5 of this chapter, we
will return to this example and determine the exact minimum distance.

To conclude this section, we will develop a relationship between the min- 
imum distance of a linear code and the form of parity check matrices for 
the code. 

(2.5) Proposition. Let $C$ be a linear code with parity check matrix $H$. If
no collection of $\delta - 1$ distinct rows of $H$ is a linearly dependent subset of
$\mathbb{F}_q^{n-k}$, then the minimum distance $d$ of $C$ satisfies $d \ge \delta$.

Proof. We use the result of Exercise 4. Let $x \in C$ be a nonzero codeword.
From the equation $xH = 0$ in $\mathbb{F}_q^{n-k}$, we see that the components of $x$ are
the coefficients in a linear combination of the rows of $H$ summing to the
zero vector. If no collection of $\delta - 1$ distinct rows is linearly dependent,
then $x$ must have at least $\delta$ nonzero entries. Hence $d \ge \delta$. □

Additional Exercises for §2 

Exercise 10. Consider the formal inner product on $\mathbb{F}_q^n$ defined by

    $\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i$

(a bilinear mapping from $\mathbb{F}_q^n \times \mathbb{F}_q^n$ to $\mathbb{F}_q$; there is no notion of
positive-definiteness in this context). Given a linear code $C$, let

    $C^{\perp} = \{x \in \mathbb{F}_q^n : \langle x, y \rangle = 0 \text{ for all } y \in C\},$

the subspace of $\mathbb{F}_q^n$ orthogonal to $C$. If $C$ is $k$-dimensional, then $C^{\perp}$ is a
linear code of block length $n$ and dimension $n - k$ known as the dual code
of $C$.

a. Let $G = (I_k \mid P)$ be a systematic generator matrix for $C$. Determine a
   generator matrix for $C^{\perp}$. How is this related to the parity check matrix
   for $C$? (Note on terminology: Many coding theory texts define a parity
   check matrix for a linear code to be the transpose of what we are calling
   a parity check matrix. This is done so that the rows of a parity check
   matrix will form a basis for the dual code.)

b. Find generator matrices and determine the parameters $[n, k, d]$ for the
   duals of the Hamming code from (2.2), and the code from (2.4).

Exercise 11. (The Hamming codes) Let $q$ be a prime power, and let
$m \ge 1$. We will call a set $S$ of vectors in $\mathbb{F}_q^m$ a maximal pairwise linearly
independent subset of $\mathbb{F}_q^m$ if $S$ has the property that no two distinct
elements of $S$ are scalar multiples of each other, and if $S$ is maximal with
respect to inclusion. For each pair $(q, m)$ we can construct linear codes $C$
by taking a parity check matrix $H \in M_{n \times m}(\mathbb{F}_q)$ whose rows form a
maximal pairwise linearly independent subset of $\mathbb{F}_q^m$, and letting $C \subset \mathbb{F}_q^n$ be
the set of solutions of the system of linear equations $xH = 0$. For instance,
with $q = 2$, we can take the rows of $H$ to be all the nonzero vectors in $\mathbb{F}_2^m$
(in any order); see (2.3) for the case $q = 2$, $m = 3$. The codes with these
parity check matrices are called the Hamming codes.

a. Show that if $S$ is a maximal pairwise linearly independent subset of $\mathbb{F}_q^m$,
   then $S$ has exactly $(q^m - 1)/(q - 1)$ elements. (This is the same as the
   number of points of the projective space $\mathbb{P}^{m-1}$ over $\mathbb{F}_q$.)

b. What is the dimension $k$ of a Hamming code defined by an $n \times m$ matrix
   $H$?

c. Write down a parity check matrix for a Hamming code with $q = 3$,
   $k = 2$.

d. Show that the minimum distance of a Hamming code is always 3,
   and discuss the error-detecting and error-correcting capabilities of these
   codes.

e. Show that all the Hamming codes are perfect codes (see Exercise 5
   above).

Exercise 12. Let $C$ be an $[n, k, d]$ linear code with parity check matrix
$H$. Show that the possible values for $yH \in \mathbb{F}_q^{n-k}$ (the syndromes) are
in one-to-one correspondence with the cosets of $C$ in $\mathbb{F}_q^n$ (or elements of
the quotient space $\mathbb{F}_q^n/C \cong \mathbb{F}_q^{n-k}$). Deduce that there are $q^{n-k}$ different
syndrome values.



§3 Cyclic Codes 

In this section, we will consider several classes of linear codes with even 
more structure, and we will see how some of the algorithmic techniques in 
symbolic algebra we have developed can be applied to encode them. First 
we will consider the class of cyclic codes. Cyclic codes may be defined in 
several ways — the most elementary is certainly the following: A cyclic code 
is a linear code with the property that the set of codewords is closed under 
cyclic permutations of the components of vectors in $\mathbb{F}_q^n$. Here is a simple
example.

In $\mathbb{F}_2^4$, consider the $[4, 2, 2]$ code $C$ with generator matrix

$$G = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix} \tag{3.1}$$

from Exercise 1 in §2. As we saw there, C contains 4 distinct codewords. 
The codewords (0, 0, 0, 0) and (1, 1, 1, 1) are themselves invariant under all 
cyclic permutations. The codeword (1, 0, 1, 0) is not itself invariant: shifting 
one place to the left (or right) we obtain (0, 1, 0, 1). But this is another 
codeword: $(0, 1, 0, 1) = (1, 1)G \in C$. Similarly, shifting (0, 1, 0, 1) one place
to the left or right, we obtain the codeword (1, 0, 1, 0) again. It follows that 
the set C is closed under all cyclic shifts. 
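The closure check above can be repeated mechanically. The following is a minimal Python sketch (Python is used here only for illustration; the helper names `encode` and `shift_right` are not from the text) that lists the four codewords generated by the matrix in (3.1) and tests closure under the right cyclic shift.

```python
from itertools import product

# Rows of the generator matrix (3.1) over F_2.
G = [(1, 1, 1, 1), (1, 0, 1, 0)]

def encode(msg, gen):
    """Return the row vector msg * gen with arithmetic in F_2."""
    n = len(gen[0])
    return tuple(sum(m * row[j] for m, row in zip(msg, gen)) % 2 for j in range(n))

def shift_right(word):
    """Cyclic shift of the components one place to the right."""
    return word[-1:] + word[:-1]

# All codewords: the images of the four possible messages in F_2^2.
code = {encode(msg, G) for msg in product((0, 1), repeat=len(G))}

# The code is cyclic when every shifted codeword is again a codeword.
is_cyclic = all(shift_right(c) in code for c in code)
print(sorted(code))
print(is_cyclic)
```

Closure under a single right shift suffices, since every other cyclic permutation is a repeated right shift.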

The property of invariance under cyclic permutations of the components 
has an interesting algebraic interpretation. Using the standard isomorphism 
between $\mathbb{F}_q^n$ and the vector space of polynomials of degree at most $n - 1$
with coefficients in $\mathbb{F}_q$:

$$(a_0, a_1, \ldots, a_{n-1}) \leftrightarrow a_0 + a_1 x + \cdots + a_{n-1}x^{n-1}$$

we may identify a cyclic code $C$ with the corresponding collection of
polynomials of degree at most $n - 1$. The right cyclic shift which sends
$(a_0, a_1, \ldots, a_{n-1})$ to $(a_{n-1}, a_0, a_1, \ldots, a_{n-2})$ is the same as the result of
multiplying the polynomial $a_0 + a_1 x + \cdots + a_{n-1}x^{n-1}$ by $x$, then taking the
remainder on division by $x^n - 1$.

Exercise 1. Show that multiplying the polynomial $p(x) = a_0 + a_1 x + \cdots + a_{n-1}x^{n-1}$
by $x$, then taking the remainder on division by $x^n - 1$
yields a polynomial whose coefficients are the same as those of $p(x)$, but
cyclically shifted one place to the right.
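A numerical instance of the statement in Exercise 1 may help fix ideas (it is an illustration, not a proof). The Python sketch below assumes a prime field $\mathbb{F}_p$ so that coefficients are integers modulo $p$; the function name `times_x_mod` is hypothetical.

```python
def times_x_mod(coeffs, p):
    """Multiply a_0 + a_1 x + ... + a_{n-1} x^{n-1} by x, then reduce modulo x^n - 1.

    coeffs lists a_0, ..., a_{n-1}; entries are taken in F_p with p prime
    (an assumption made only to keep the arithmetic simple).
    """
    prod = [0] + list(coeffs)          # coefficients of x * p(x), degree up to n
    top = prod.pop()                   # coefficient of x^n
    prod[0] = (prod[0] + top) % p      # x^n = 1 modulo x^n - 1
    return prod

p_coeffs = [2, 0, 1, 4, 3]             # 2 + x^2 + 4x^3 + 3x^4 over F_5
shifted = times_x_mod(p_coeffs, 5)
print(shifted)                         # [3, 2, 0, 1, 4]
```

The output is the input list rotated one place to the right, as the exercise asserts.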

This suggests that when dealing with cyclic codes we should consider the 
polynomials of degree at most n — 1 as the elements of the quotient ring 
$R = \mathbb{F}_q[x]/\langle x^n - 1\rangle$. The reason is that multiplication of $f(x)$ by $x$ followed
by division gives the standard representative for the product $xf(x)$ in $R$.
Hence, from now on we will consider cyclic codes as vector subspaces of the 
ring R which are closed under multiplication by the coset of x in R. Now 
we make a key observation. 

Exercise 2. Show that if a vector subspace $C \subset R$ is closed under
multiplication by $[x]$, then it is closed under multiplication by every coset
$[h(x)] \in R$.

Exercise 2 shows that cyclic codes have the defining property of ideals 
in a ring. We record this fact in the following proposition. 

(3.2) Proposition. Let $R = \mathbb{F}_q[x]/\langle x^n - 1\rangle$. A vector subspace $C \subset R$ is
a cyclic code if and only if $C$ is an ideal in the ring $R$.

The ring $R$ shares a nice property with its “parent” ring $\mathbb{F}_q[x]$.

(3.3) Proposition. Each nonzero ideal $I \subset R$ is principal, generated by
the coset of a single polynomial $g$ of degree $n - 1$ or less. Moreover, $g$ is a
divisor of $x^n - 1$ in $\mathbb{F}_q[x]$.

Proof. By the standard characterization of ideals in a quotient ring (see
e.g. [CLO] Chapter 5, §2, Proposition 10), the ideals in $R$ are in one-to-one
correspondence with the ideals in $\mathbb{F}_q[x]$ containing $\langle x^n - 1\rangle$. Let $J$ be the
ideal corresponding to $I$. Since all ideals in $\mathbb{F}_q[x]$ are principal, $J$ must be
generated by some $g(x)$. Since $x^n - 1$ is in $J$, $g(x)$ is a divisor of $x^n - 1$ in
$\mathbb{F}_q[x]$. The ideal $I = J/\langle x^n - 1\rangle$ is generated by the coset of $g(x)$ in $R$. □

Naturally enough, the polynomial g in Proposition (3.3) is called a 
generator polynomial for the cyclic code. 

Exercise 3. Identifying the 4-tuple $(a, b, c, d) \in \mathbb{F}_2^4$ with $[a + bx + cx^2 + dx^3] \in R = \mathbb{F}_2[x]/\langle x^4 - 1\rangle$,
show that the cyclic code in $\mathbb{F}_2^4$ with generator
matrix (3.1) can be viewed as the ideal generated by the coset of $g = 1 + x^2$
in $R$. Find the codewords of the cyclic code with generator $1 + x$ in $R$.

The Reed-Solomon codes are one particularly interesting class of cyclic 
codes used extensively in applications. For example, a clever combination 
of two of these codes is used for error control in playback of sound record- 
ings in the Compact Disc audio system developed by Philips in the early 
1980’s. They are attractive because they have good burst error correcting 
capabilities (see Exercise 15 below) and also because efficient decoding al- 
gorithms are available for them (see the next section). We will begin with 
a description of these codes via generator matrices, then show that they 
have the invariance property under cyclic shifts. 

Choose a finite field $\mathbb{F}_q$ and consider codes of block length $n = q - 1$
constructed in the following way. Let $\alpha$ be a primitive element for $\mathbb{F}_q$ (see
Theorem (1.9) of this chapter), fix $k < q$, and let
$L_{k-1} = \{\sum_{i=0}^{k-1} a_i t^i : a_i \in \mathbb{F}_q\}$
be the vector space of polynomials of degree at most $k - 1 < q - 1$
in $\mathbb{F}_q[t]$. We make words in $\mathbb{F}_q^{q-1}$ by evaluating polynomials in $L_{k-1}$ at the
$q - 1$ nonzero elements of $\mathbb{F}_q$. By definition

$$C = \{(f(1), f(\alpha), \ldots, f(\alpha^{q-2})) \in \mathbb{F}_q^{q-1} : f \in L_{k-1}\} \tag{3.4}$$

is a Reed-Solomon code, sometimes denoted by $RS(k, q)$. $C$ is a vector
subspace of $\mathbb{F}_q^{q-1}$ since it is the image of the vector space $L_{k-1}$ under the
linear evaluation mapping

$$f \mapsto (f(1), f(\alpha), \ldots, f(\alpha^{q-2})).$$

Generator matrices for Reed-Solomon codes can be obtained by taking 
any basis of Lk-i and evaluating to form the corresponding codewords. The 
monomial basis $\{1, t, \ldots, t^{k-1}\}$ is the simplest. For example, consider
the Reed-Solomon code over $\mathbb{F}_9$ with $k = 3$. Using the basis $\{1, t, t^2\}$ for
$L_2$, we obtain the generator matrix

$$G = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & \alpha & \alpha^2 & \alpha^3 & \alpha^4 & \alpha^5 & \alpha^6 & \alpha^7 \\ 1 & \alpha^2 & \alpha^4 & \alpha^6 & 1 & \alpha^2 & \alpha^4 & \alpha^6 \end{pmatrix} \tag{3.5}$$

where the first row gives the values of $f(t) = 1$, the second row gives the
values of $f(t) = t$, and the third gives the values of $f(t) = t^2$ at the nonzero
elements of $\mathbb{F}_9$ (recall, $\alpha^8 = 1$ in $\mathbb{F}_9$). For all $k < q$, the first $k$ columns
of the generator matrix corresponding to the monomial basis of $L_{k-1}$ give
a submatrix of Vandermonde form with nonzero determinant. It follows
that the evaluation mapping is one-to-one, and the corresponding
Reed-Solomon code is a linear code with block length $n = q - 1$, and dimension
$k = \dim L_{k-1}$.
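The construction of (3.5) can be reproduced with a few lines of finite field arithmetic. The Python sketch below is only an illustration: it assumes the model $\mathbb{F}_9 = \mathbb{F}_3[\alpha]/\langle \alpha^2 + \alpha + 2\rangle$ (the one used later in this section), stores $a + b\alpha$ as the pair `(a, b)`, and uses helper names (`add`, `mul`, `power`, `det3`) that are not from the text.

```python
# Elements of F_9 = F_3[alpha]/<alpha^2 + alpha + 2>: the pair (a, b) means a + b*alpha.
def add(u, v):
    return ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3)

def sub(u, v):
    return ((u[0] - v[0]) % 3, (u[1] - v[1]) % 3)

def mul(u, v):
    a, b = u
    c, d = v
    # alpha^2 = -alpha - 2 = 2*alpha + 1 in characteristic 3
    return ((a * c + b * d) % 3, (a * d + b * c + 2 * b * d) % 3)

def power(u, e):
    result = (1, 0)
    for _ in range(e):
        result = mul(result, u)
    return result

ALPHA = (0, 1)
points = [power(ALPHA, i) for i in range(8)]     # 1, alpha, ..., alpha^7

# Rows of (3.5): the values of 1, t, t^2 at the nonzero elements of F_9.
G = [[power(pt, j) for pt in points] for j in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix over F_9 by cofactor expansion along the first row."""
    def minor(a, b, c, d):
        return sub(mul(a, d), mul(b, c))
    t0 = mul(m[0][0], minor(m[1][1], m[1][2], m[2][1], m[2][2]))
    t1 = mul(m[0][1], minor(m[1][0], m[1][2], m[2][0], m[2][2]))
    t2 = mul(m[0][2], minor(m[1][0], m[1][1], m[2][0], m[2][1]))
    return add(sub(t0, t1), t2)

vandermonde = [row[:3] for row in G]             # first k = 3 columns of G
print(det3(vandermonde))
```

A nonzero determinant for the first three columns confirms that the three rows are linearly independent, so the code has dimension 3.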

The generator matrix formed using the monomial basis of Lk-i also 
brings the cyclic nature of Reed-Solomon codes into sharp focus. Observe 
that each cyclic shift of a row of the matrix $G$ in (3.5) yields a scalar
multiple of the same row. For example, cyclically shifting the third row one
space to the right, we obtain

$$(\alpha^6, 1, \alpha^2, \alpha^4, \alpha^6, 1, \alpha^2, \alpha^4) = \alpha^6 \cdot (1, \alpha^2, \alpha^4, \alpha^6, 1, \alpha^2, \alpha^4, \alpha^6).$$
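Since every entry of (3.5) is a power of $\alpha$, the shift identity can be checked using exponents alone. In the Python sketch below (an illustration; the function names are hypothetical) an entry $\alpha^e$ is stored as the integer $e$ modulo 8, so multiplying by $\alpha^s$ adds $s$ to every exponent.

```python
N = 8  # alpha has multiplicative order 8 in F_9

def row_exponents(j):
    """Row of (3.5) for f(t) = t^j, stored as exponents e with entry alpha^e."""
    return [(j * i) % N for i in range(N)]

def shift_right(row):
    return row[-1:] + row[:-1]

def scale(row, s):
    """Multiply every entry alpha^e of the row by alpha^s."""
    return [(e + s) % N for e in row]

for j in range(3):
    row = row_exponents(j)
    # Shifting right replaces alpha^(j*i) by alpha^(j*(i-1)) = alpha^(-j) * alpha^(j*i).
    assert shift_right(row) == scale(row, (-j) % N)

print(shift_right(row_exponents(2)))   # [6, 0, 2, 4, 6, 0, 2, 4]
```

For the third row the scalar is $\alpha^{-2} = \alpha^6$, in agreement with the displayed identity.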



Exercise 4. Show that the other rows of (3.5) also have the property that 
a cyclic shift takes the row to a scalar multiple of the same row. Show that 
this observation implies this Reed-Solomon code is cyclic. Then generalize 
your arguments to all Reed-Solomon codes. Hint: Use the original definition 
of cyclic codes — closure under all cyclic shifts. You may wish to begin by 
showing that the cyclic shifts are linear mappings on $\mathbb{F}_q^n$.

We will give another proof that Reed-Solomon codes are cyclic below, 
and also indicate how to find the generator polynomial. However, we pause 
at this point to note one of the other interesting properties of Reed-Solomon 
codes. Since no polynomial in $L_{k-1}$ can have more than $k - 1$ zeroes in
$\mathbb{F}_q$, every codeword in $C$ has at least $(q - 1) - (k - 1) = q - k$ nonzero
components (and some have exactly this many). By Exercise 4 of §2, the
minimum distance for a Reed-Solomon code is $d = q - k = n - k + 1$.
Comparing this with the Singleton bound from part d of Exercise 7 from §2,
we see that Reed-Solomon codes have the maximum possible $d$ for the block
length $q - 1$ and dimension $k$. Codes with this property are called MDS
(“maximum distance separable”) codes in the literature. So Reed-Solomon
codes are good in this sense. However, their fixed, small block length relative 
to the size of the alphabet is sometimes a disadvantage. There is a larger 
class of cyclic codes known as BCH codes which contain the Reed-Solomon 
codes as a special case, but which do not have this limitation. Moreover, 
a reasonably simple lower bound on d is known for all BCH codes. See 
Exercise 13 below and [MS] or [vLi] for more on BCH codes. 
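The formula $d = q - k$ can be confirmed by brute force in a small case. The Python sketch below assumes the prime field $\mathbb{F}_7$ with primitive element 3 and $k = 3$ (choices made only so that integer arithmetic modulo 7 suffices; they are not from the text) and computes the minimum weight over all nonzero codewords.

```python
from itertools import product

Q, K = 7, 3          # assumed small example: the prime field F_7, dimension 3
ALPHA = 3            # 3 is a primitive element modulo 7

points = [pow(ALPHA, i, Q) for i in range(Q - 1)]   # 1, alpha, ..., alpha^(q-2)

def evaluate(coeffs, t):
    """Evaluate a_0 + a_1 t + ... at t in F_7 by Horner's rule."""
    value = 0
    for a in reversed(coeffs):
        value = (value * t + a) % Q
    return value

def codeword(coeffs):
    """Evaluation of f at the nonzero elements of F_7, as in (3.4)."""
    return tuple(evaluate(coeffs, pt) for pt in points)

# Hamming weights of the codewords of all nonzero f in L_{k-1}.
weights = [
    sum(1 for c in codeword(f) if c != 0)
    for f in product(range(Q), repeat=K)
    if any(f)
]
d = min(weights)
print(d, Q - K)      # 4 4
```

Here $n = 6$ and $k = 3$, so $d = 4 = n - k + 1$ meets the Singleton bound.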

Next, we will see another way to show that Reed-Solomon codes are 
cyclic that involves somewhat more machinery, but sheds additional light 
on the structure of cyclic codes of block length $q - 1$ in general. Recall from
Proposition (3.3) that the generator polynomial of a cyclic code of block
length $q - 1$ is a divisor of $x^{q-1} - 1$. By Lagrange’s Theorem, each of the
$q - 1$ nonzero elements of $\mathbb{F}_q$ is a root of $x^{q-1} - 1 = 0$, hence

$$x^{q-1} - 1 = \prod_{\beta \in \mathbb{F}_q^*} (x - \beta)$$

in $\mathbb{F}_q[x]$, where $\mathbb{F}_q^*$ is the set of nonzero elements of $\mathbb{F}_q$. Consequently, the
divisors of $x^{q-1} - 1$ are precisely the polynomials of the form $\prod_{\beta \in S}(x - \beta)$
for subsets $S \subset \mathbb{F}_q^*$. This is the basis for another characterization of cyclic
codes. 



Exercise 5. Show that a linear code of dimension $k$ in the quotient ring
$R = \mathbb{F}_q[x]/\langle x^{q-1} - 1\rangle$ is cyclic if and only if the codewords, viewed as
polynomials of degree at most $q - 2$, have some set $S$ of $q - k - 1$ common
zeroes in $\mathbb{F}_q^*$. Hint: If the codewords have the elements in $S$ as common
zeroes, then each codeword is divisible by $g(x) = \prod_{\beta \in S}(x - \beta)$.



Using this exercise, we will now determine the generator polynomial of a
Reed-Solomon code. Let $f(t) = \sum_{j=0}^{k-1} a_j t^j$ be an element of $L_{k-1}$. Consider
the values $c_i = f(\alpha^i)$ for $i = 0, \ldots, q - 2$. Using the $c_i$ as the coefficients
of a polynomial as in the discussion leading up to Proposition (3.2), write
the corresponding codeword as $c(x) = \sum_{i=0}^{q-2} c_i x^i$. But then substituting
for $c_i$ and interchanging the order of summation, we obtain

$$c(\alpha^\ell) = \sum_{i=0}^{q-2} c_i \alpha^{i\ell} = \sum_{j=0}^{k-1} a_j \left(\sum_{i=0}^{q-2} \alpha^{i(\ell + j)}\right).$$

Assume that $1 \le \ell \le q - k - 1$. Then for all $0 \le j \le k - 1$, we have
$1 \le \ell + j \le q - 2$. By the result of Exercise 8 of §1, each of the inner sums on
the right is zero so $c(\alpha^\ell) = 0$. Using Exercise 5, we have obtained another
proof of the fact that Reed-Solomon codes are cyclic, since the codewords
have the set of common zeroes $S = \{\alpha, \alpha^2, \ldots, \alpha^{q-k-1}\}$. Moreover, we
have the following result.



(3.7) Proposition. Let $C$ be the Reed-Solomon code of dimension $k$ and
minimum distance $d = q - k$ over $\mathbb{F}_q$. Then the generator polynomial of $C$
has the form

$$g = (x - \alpha) \cdots (x - \alpha^{q-k-1}) = (x - \alpha) \cdots (x - \alpha^{d-1}).$$

For example, the Reed-Solomon codewords corresponding to the three
rows of the matrix $G$ in (3.5) above are $a = 1 + x + x^2 + \cdots + x^7$,
$b = 1 + \alpha x + \alpha^2 x^2 + \cdots + \alpha^7 x^7$, and $c = 1 + \alpha^2 x + \alpha^4 x^2 + \cdots + \alpha^6 x^7$.
Using Exercise 8 of §1, it is not difficult to see that the common roots of
$a(x) = b(x) = c(x) = 0$ in $\mathbb{F}_9$ are $x = \alpha, \ldots, \alpha^5$, so the generator
polynomial for this code is

$$g = (x - \alpha)(x - \alpha^2)(x - \alpha^3)(x - \alpha^4)(x - \alpha^5).$$

Also see Exercise 11 below for another point of view on Reed-Solomon and 
related codes. 
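The common zeroes predicted by Proposition (3.7) can be observed directly in a small case. The Python sketch below again assumes $\mathbb{F}_7$ with primitive element 3 and $k = 3$ (an assumed example, chosen so that arithmetic is integer arithmetic modulo 7) and checks that every codeword polynomial vanishes at $\alpha, \ldots, \alpha^{q-k-1}$, while $\alpha^{q-k}$ is not a common zero.

```python
from itertools import product

Q, K, ALPHA = 7, 3, 3   # F_7 with primitive element 3 (an assumed small example)

def evaluate(coeffs, t):
    """Evaluate a polynomial (coefficients from degree 0 upward) at t in F_7."""
    value = 0
    for a in reversed(coeffs):
        value = (value * t + a) % Q
    return value

points = [pow(ALPHA, i, Q) for i in range(Q - 1)]
zeros = [pow(ALPHA, l, Q) for l in range(1, Q - K)]     # alpha, ..., alpha^(q-k-1)

def codeword_poly(f):
    """Coefficients c_0, ..., c_(q-2) of c(x), where c_i = f(alpha^i)."""
    return [evaluate(f, pt) for pt in points]

# Every codeword polynomial vanishes at every element of the set S.
all_vanish = all(
    evaluate(codeword_poly(f), z) == 0
    for f in product(range(Q), repeat=K)
    for z in zeros
)

# alpha^(q-k) is not a common zero: the codeword of f = t^(k-1) does not vanish there.
extra = evaluate(codeword_poly([0] * (K - 1) + [1]), pow(ALPHA, Q - K, Q))
print(all_vanish, extra != 0)
```

So in this example the set of common zeroes is exactly $\{\alpha, \alpha^2, \alpha^3\}$, and the generator polynomial has degree $q - k - 1 = 3$.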

From the result of Proposition (3.2), it is natural to consider the following
generalization of the cyclic codes described above. Let $R$ be a quotient ring
of $\mathbb{F}_q[x_1, \ldots, x_m]$ of the form

$$R = \mathbb{F}_q[x_1, \ldots, x_m]/\langle x_1^{n_1} - 1, \ldots, x_m^{n_m} - 1\rangle$$

for some $n_1, \ldots, n_m$. Any ideal $I$ in $R$ will be a linear code closed under
products by arbitrary $h(x_1, \ldots, x_m)$ in $R$. We will call any code obtained
in this way an m-dimensional cyclic code.

Note first that $H = \{x_1^{n_1} - 1, \ldots, x_m^{n_m} - 1\}$ is a Gröbner basis for the ideal
it generates, with respect to all monomial orders. (This follows for instance
from Theorem 3 and Proposition 4 of Chapter 2, §9 of [CLO].) Hence
standard representatives for elements of $R$ can be computed by applying
the division algorithm in $\mathbb{F}_q[x_1, \ldots, x_m]$ and computing remainders with
respect to $H$. We obtain in this way as representatives of elements of $R$ all
polynomials whose degree in $x_i$ is $n_i - 1$ or less for each $i$.

Exercise 6. Show that as a vector space,

$$R = \mathbb{F}_q[x_1, \ldots, x_m]/\langle x_1^{n_1} - 1, \ldots, x_m^{n_m} - 1\rangle \cong \mathbb{F}_q^{n_1 \cdot n_2 \cdots n_m}.$$

Multiplication of an element of $R$ by $x_1$, for example, can be viewed as
a sort of cyclic shift in one of the variables. Namely, writing a codeword
$c(x_1, \ldots, x_m) \in I$ as a polynomial in $x_1$, whose coefficients are polynomials
in the other variables: $c = \sum_{j=0}^{n_1 - 1} c_j(x_2, \ldots, x_m)x_1^j$, multiplication by $x_1$,
followed by division by $H$ yields the standard representative
$x_1 c = c_{n_1-1} + c_0 x_1 + c_1 x_1^2 + \cdots + c_{n_1-2}x_1^{n_1-1}$. Since $c \in I$ this shifted polynomial is also
a codeword. The same is true for each of the other variables $x_2, \ldots, x_m$.

In the case m = 2, for instance, it is customary to think of the codewords 
of a 2-dimensional cyclic code either as polynomials in two variables, or as 
matrices of coefficients. In the matrix interpretation, multiplication by $x_1$
then corresponds to the right cyclic shift on each row, while multiplication
by $x_2$ corresponds to a cyclic shift on each of the columns. Each of these
operations leaves the set of codewords invariant. 
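The matrix interpretation can be made concrete as follows. This Python sketch is illustrative only: the convention that `matrix[i][j]` holds the coefficient of $x_1^j x_2^i$, and the function name `multiply_by_monomial`, are choices made here, not taken from the text. Because multiplying by a monomial only permutes coefficients, the entries are treated as opaque labels.

```python
def multiply_by_monomial(matrix, dx, dy):
    """Multiply c(x1, x2) by x1^dx * x2^dy in F_q[x1, x2]/<x1^n1 - 1, x2^n2 - 1>.

    matrix[i][j] holds the coefficient of x1^j * x2^i, so rows are indexed by the
    power of x2 and columns by the power of x1 (a convention chosen for this sketch).
    """
    n2, n1 = len(matrix), len(matrix[0])
    result = [[0] * n1 for _ in range(n2)]
    for i in range(n2):
        for j in range(n1):
            # exponents are reduced modulo n1 and n2 because x1^n1 = x2^n2 = 1
            result[(i + dy) % n2][(j + dx) % n1] = matrix[i][j]
    return result

c = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

by_x1 = multiply_by_monomial(c, 1, 0)   # right cyclic shift of every row
by_x2 = multiply_by_monomial(c, 0, 1)   # cyclic shift of every column
print(by_x1)   # [[3, 1, 2], [6, 4, 5], [9, 7, 8]]
print(by_x2)   # [[7, 8, 9], [1, 2, 3], [4, 5, 6]]
```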

Exercise 7. Writing $\mathbb{F}_4 = \mathbb{F}_2[\alpha]/\langle \alpha^2 + \alpha + 1\rangle$, the ideal
$I \subset \mathbb{F}_4[x, y]/\langle x^3 - 1, y^3 - 1\rangle$ generated by $g_1(x, y) = x^2 + \alpha^2 xy + \alpha y$, $g_2(x, y) = y + 1$ gives
an example of a 2-dimensional cyclic code with $n = 3^2 = 9$. As an exercise,
determine $k$, the vector space dimension of this 2-dimensional cyclic code,
by determining a vector space basis for $I$ over $\mathbb{F}_4$. (Answer: $k = 7$. Also
see the discussion following Theorem (3.9) below.) The minimum distance
of this code is $d = 2$. Do you see why?

To define an m-dimensional cyclic code, it suffices to give a set of
generators $\{[f_1], \ldots, [f_s]\} \subset R$ for the ideal $I \subset R$. The corresponding ideal $J$
in $\mathbb{F}_q[x_1, \ldots, x_m]$ is

$$J = \langle f_1, \ldots, f_s\rangle + \langle x_1^{n_1} - 1, \ldots, x_m^{n_m} - 1\rangle.$$

Fix any monomial order on $\mathbb{F}_q[x_1, \ldots, x_m]$. With a Gröbner basis
$G = \{g_1, \ldots, g_t\}$ for $J$ with respect to this order we have everything necessary to
determine whether a given element of $R$ is in $I$ using the division algorithm
in $\mathbb{F}_q[x_1, \ldots, x_m]$.

(3.8) Proposition. Let $R$, $I$, $J$, $G$ be as above. A polynomial $h(x_1, \ldots, x_m)$
represents an element of $I$ in $R$ if and only if its remainder on division by
$G$ is zero.

Proof. This follows because $I = J/\langle x_1^{n_1} - 1, \ldots, x_m^{n_m} - 1\rangle$ and standard
isomorphism theorems (see Theorem 2.6 of [Jac]) give a ring isomorphism

$$R/I \cong \mathbb{F}_q[x_1, \ldots, x_m]/J.$$

See Exercise 14 below for the details. □

An immediate consequence of Proposition (3.8) is the following system- 
atic encoding algorithm for m-dimensional cyclic codes using division with 
respect to a Grobner basis. One of the advantages of m-dimensional cyclic 
codes over linear codes in general is that their extra structure allows a very 
compact representation of the encoding function. We only need to know 
a reduced Grobner basis for the ideal J corresponding to a cyclic code to 
perform systematic encoding. A Grobner basis will generally have fewer 
elements than a vector space basis of I. This frequently means that much 
less information needs to be stored. In the following description of a
systematic encoder, the information positions of a codeword will refer to the $k$
positions in the codeword that duplicate the components of the element of
$\mathbb{F}_q^k$ that is being encoded. These will correspond to a certain subset of the
coefficients in a polynomial representative for an element of R. Similarly, 
the parity check positions are the complementary collection of coefficients. 

(3.9) Theorem. Let $I \subset R = \mathbb{F}_q[x_1, \ldots, x_m]/\langle x_1^{n_1} - 1, \ldots, x_m^{n_m} - 1\rangle$
be an m-dimensional cyclic code, and let $G$ be a Gröbner basis for the
corresponding ideal $J \subset \mathbb{F}_q[x_1, \ldots, x_m]$ with respect to some monomial
order. Then there is a systematic encoding function for $I$ constructed as
follows.

a. The information positions are the coefficients of the nonstandard
monomials for $J$ in which each $x_i$ appears to a power at most $n_i - 1$.
(Non-standard monomials are monomials in $\langle \mathrm{LT}(J)\rangle$.)

b. The parity check positions are the coefficients of the standard
monomials. (The standard monomials are those not contained in $\langle \mathrm{LT}(J)\rangle$.)

c. The following algorithm gives a systematic encoder $E$ for $I$:

    Input: the Gröbner basis $G$ for $J$,
           $w$, a linear combination of nonstandard monomials
    Output: $E(w) \in I$
    Uses: Division algorithm with respect to given order
    $\overline{w} := \overline{w}^G$ (the remainder on division)
    $E(w) := w - \overline{w}$

Proof. The dimension of $R/I$ as a vector space over $\mathbb{F}_q$ is equal to the
number of standard monomials for $J$ since $R/I \cong \mathbb{F}_q[x_1, \ldots, x_m]/J$. (See
for instance Proposition 4 from Chapter 5, §3 of [CLO].) The dimension of $I$
as a vector space over $\mathbb{F}_q$ is equal to the difference $\dim R - \dim R/I$. But this
is the same as the number of nonstandard monomials for $J$, in which each
$x_i$ appears to a power at most $n_i - 1$. Hence the span of those monomials is
a subspace of $R$ of the same dimension as $I$. Let $w$ be a linear combination
of only these nonstandard monomials. By the properties of the division
algorithm, $\overline{w}$ is a linear combination of only standard monomials, so the
symbols from $w$ are not changed in the process of computing $E(w) = w - \overline{w}$.
By Proposition (3.8), the difference $w - \overline{w}$ is an element of the ideal $I$, so
it represents a codeword. As a result $E$ is a systematic encoding function
for $I$. □

In the case $m = 1$, the Gröbner basis for $J$ is the generator polynomial
$g$, and the remainder $\overline{w}$ is computed by ordinary 1-variable polynomial
division. For example, let $\mathbb{F}_9 = \mathbb{F}_3[\alpha]/\langle \alpha^2 + \alpha + 2\rangle$ ($\alpha$ is a primitive
element by (1.1)) and consider the Reed-Solomon code over $\mathbb{F}_9$ with $n = 8$,
$k = 5$. By Proposition (3.7), the generator polynomial for this code is
$g = (x - \alpha)(x - \alpha^2)(x - \alpha^3)$, and $\{g\}$ is a Gröbner basis for the ideal $J$ in
$\mathbb{F}_9[x]$ corresponding to the Reed-Solomon code. By Theorem (3.9), as
information positions for a systematic encoder we can take the coefficients of
the nonstandard monomials $x^7, x^6, \ldots, x^3$ in an element of $\mathbb{F}_9[x]/\langle x^8 - 1\rangle$.
The parity check positions are the coefficients of the standard monomials
$x^2, x, 1$. To encode a word $w(x) = x^7 + \alpha x^5 + (\alpha + 1)x^3$, for instance, we
divide $g$ into $w$, obtaining the remainder $\overline{w}$. Then $E(w) = w - \overline{w}$. Here
is a Maple session performing this computation. We use the method
discussed in §§1,2 for dealing with polynomials with coefficients in a finite
field. First we find the generator polynomial for the Reed-Solomon code as
above, using:

    alias(alpha = RootOf(t^2 + t + 2));
    g := collect(Expand((x-alpha)*(x-alpha^2)*
         (x-alpha^3)) mod 3, x);

This produces output

    g := x^3 + alpha x^2 + (1 + alpha)x + 2 alpha + 1.

Then

    w := x^7 + alpha*x^5 + (alpha + 1)*x^3:
    (w - Rem(w,g,x)) mod 3;

yields output as follows

    x^7 + alpha x^5 + (1 + alpha)x^3 + 2(2 + 2 alpha)x^2 + x + 2.

After simplifying the coefficient of $x^2$ to $\alpha + 1$, this is the Reed-Solomon
codeword.
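For readers without Maple, the same computation can be sketched in Python. This is an illustrative re-implementation under stated assumptions, not the method of the text: elements $a + b\alpha$ of $\mathbb{F}_9$ are stored as pairs `(a, b)`, polynomials are coefficient lists with the constant term first, and all helper names are hypothetical.

```python
# F_9 = F_3[alpha]/<alpha^2 + alpha + 2>; the pair (a, b) stands for a + b*alpha.
ZERO, ONE, ALPHA = (0, 0), (1, 0), (0, 1)

def add(u, v):
    return ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3)

def neg(u):
    return ((-u[0]) % 3, (-u[1]) % 3)

def mul(u, v):
    a, b = u
    c, d = v
    # alpha^2 = 2*alpha + 1 in F_9
    return ((a * c + b * d) % 3, (a * d + b * c + 2 * b * d) % 3)

def poly_mul(f, g):
    """Product of polynomials given as coefficient lists, lowest degree first."""
    out = [ZERO] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = add(out[i + j], mul(a, b))
    return out

def poly_rem(f, g):
    """Remainder of f on division by a monic polynomial g."""
    r = list(f)
    for k in range(len(r) - 1, len(g) - 2, -1):
        lead = r[k]
        for j, b in enumerate(g):
            idx = k - (len(g) - 1) + j
            r[idx] = add(r[idx], neg(mul(lead, b)))   # subtract lead * x^(k-deg g) * g
    return r[: len(g) - 1]

# g = (x - alpha)(x - alpha^2)(x - alpha^3)
g = [ONE]
root = ONE
for _ in range(3):
    root = mul(root, ALPHA)
    g = poly_mul(g, [neg(root), ONE])

# w = x^7 + alpha*x^5 + (alpha + 1)*x^3
w = [ZERO, ZERO, ZERO, (1, 1), ZERO, ALPHA, ZERO, ONE]
w_bar = poly_rem(w, g)
codeword = [add(a, neg(b)) for a, b in zip(w, w_bar + [ZERO] * 5)]
print(g)
print(codeword)
```

Read from the constant term upward, `codeword` lists the coefficients $2, 1, 1 + \alpha, 1 + \alpha, 0, \alpha, 0, 1$, which agrees with the Maple output, and the information positions $x^3, \ldots, x^7$ are unchanged from $w$.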

Next, we consider the 2-dimensional cyclic code in Exercise 7. Recall
$I \subset R = \mathbb{F}_4[x, y]/\langle x^3 - 1, y^3 - 1\rangle$ generated by $g_1(x, y) = x^2 + \alpha^2 xy + \alpha y$,
$g_2(x, y) = y + 1$. Take $\mathbb{F}_4 = \mathbb{F}_2[\alpha]/\langle \alpha^2 + \alpha + 1\rangle$ and note that $-1 = +1$
in this field. Hence $x^3 - 1$ is the same as $x^3 + 1$, and so forth. As above, we
must consider the corresponding ideal

$$J = \langle x^2 + \alpha^2 xy + \alpha y,\ y + 1,\ x^3 + 1,\ y^3 + 1\rangle$$

in $\mathbb{F}_4[x, y]$. Applying Buchberger’s algorithm to compute a reduced lex
Gröbner basis ($x > y$) for this ideal, we find

$$G = \{x^2 + \alpha^2 x + \alpha,\ y + 1\}.$$

As an immediate result, the quotient ring $\mathbb{F}_4[x, y]/J \cong R/I$ is
2-dimensional, while $R$ is 9-dimensional over $\mathbb{F}_4$. Hence $I$ has dimension
$9 - 2 = 7$. There are also exactly two points in $V(J)$. According to
Theorem (3.9), the information positions for this code are the coefficients of
$x^2, y, xy, x^2y, y^2, xy^2, x^2y^2$, and the parity checks are the coefficients of
$1, x$. To encode $w = x^2y^2$, for example, we would compute the remainder
on division by $G$, which is $\overline{x^2y^2}^G = \alpha^2 x + \alpha$, then subtract to obtain
$E(w) = x^2y^2 + \alpha^2 x + \alpha$.
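This small example can be checked without a Gröbner basis package, because division by $G$ amounts to two rewriting rules: $y \to 1$ and $x^2 \to \alpha^2 x + \alpha$ (signs are irrelevant in characteristic 2). The Python sketch below is an illustration under assumptions made here: elements of $\mathbb{F}_4$ are encoded as the integers 0, 1, 2, 3 with $2 = \alpha$ and $3 = \alpha^2$, addition is bitwise XOR, and the function names are hypothetical.

```python
A, A2 = 2, 3                      # alpha and alpha^2 = alpha + 1
EXP, LOG = [1, 2, 3], {1: 0, 2: 1, 3: 2}

def mul(a, b):
    """Multiplication in F_4 via exponents of alpha (alpha^3 = 1)."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 3]

def normal_form(poly):
    """Remainder on division by G = {x^2 + alpha^2*x + alpha, y + 1}.

    poly maps (i, j) to the coefficient of x^i y^j. Returns [r0, r1] for r0 + r1*x.
    """
    max_deg = max((i for i, _ in poly), default=0)
    r = [0] * (max(max_deg, 1) + 1)
    for (i, _j), coeff in poly.items():
        r[i] ^= coeff             # y = 1 modulo y + 1, and addition in F_4 is XOR
    for k in range(len(r) - 1, 1, -1):
        lead, r[k] = r[k], 0
        # x^k = x^(k-2) * (alpha^2*x + alpha) modulo the first element of G
        r[k - 1] ^= mul(lead, A2)
        r[k - 2] ^= mul(lead, A)
    return r[:2]

g1 = {(2, 0): 1, (1, 1): A2, (0, 1): A}      # x^2 + alpha^2*x*y + alpha*y
x3_plus_1 = {(3, 0): 1, (0, 0): 1}
w = {(2, 2): 1}                               # w = x^2 y^2

w_bar = normal_form(w)                        # [alpha, alpha^2], i.e. alpha^2*x + alpha
encoded = dict(w)
encoded[(0, 0)] = w_bar[0]                    # subtracting equals adding in char 2
encoded[(1, 0)] = w_bar[1]
print(w_bar, normal_form(encoded))
```

The generators of $J$ and the encoded word all reduce to zero, which by Proposition (3.8) means they represent elements of $I$.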

Unfortunately, the grobner package supplied with Maple does not at 
present support finite field coefficient arithmetic. (The Domains package 
distributed with Release 4 of Maple V does contain the groundwork for ex- 
tending the Grobner basis routines to polynomials over finite fields, so this 
feature may appear in future versions.) Other computer algebra systems 
such as Axiom, Singular, and Macaulay can handle these computations. 



Additional Exercises for §3 

Exercise 8. Let $C$ be a cyclic code in $R = \mathbb{F}_q[x]/\langle x^n - 1\rangle$, with monic
generator polynomial $g(x)$ of degree $n - k$, so that the dimension of $C$
is $k$. Write out a generator matrix for $C$ as a linear code, viewing the
encoding procedure of Theorem (3.9) as a linear map from the span of
$\{x^{n-k}, x^{n-k+1}, \ldots, x^{n-1}\}$ to $R$. In particular show that every row of the
matrix is determined by the first row, i.e. the image $E(x^{n-k})$. This gives
another way to understand how the cyclic property reduces the amount of 
information necessary to describe a code. 



Exercise 9. This exercise will study the dual of a cyclic code of block
length $q - 1$ or $(q - 1)^m$ more generally. See Exercise 10 from §2 for the
definition of the dual of a linear code. Let $R = \mathbb{F}_q[x]/\langle x^{q-1} - 1\rangle$ as in the
discussion of Reed-Solomon codes.

a. Show that if $f(x) = \sum_{i=0}^{q-2} a_i x^i$ and $h(x) = \sum_{i=0}^{q-2} b_i x^i$ represent
any two elements of $R$, then the inner product $\langle a, b\rangle$ of their vectors
of coefficients is the same as the constant term in the product
$f(x)h(x^{-1}) = f(x)h(x^{q-2})$ in $R$.

b. Let $C$ be a cyclic code in $R$. Show that the dual code $C^\perp$ is equal to
the collection of polynomials $h(x)$ such that $f(x)h(x^{-1}) = 0$ (product
in $R$) for all $f(x) \in C$.

c. Use part b to describe the generator polynomial for $C^\perp$ in terms of the
generator $g(x)$ for $C$. Hint: recall from the proof of Proposition (3.3)
that $g(x)$ is a divisor of $x^{q-1} - 1 = \prod_{\beta \in \mathbb{F}_q^*}(x - \beta)$. The generator
polynomial for $C^\perp$ will have the same property.

d. Extend these results to m-dimensional cyclic codes in
$\mathbb{F}_q[x_1, \ldots, x_m]/\langle x_i^{q-1} - 1 : i = 1, \ldots, m\rangle$.



Exercise 10. This exercise discusses another approach to the study of
cyclic codes of block length $q - 1$, which recovers the result of Exercise 5
in a different way. Namely, consider the ring $R = \mathbb{F}_q[x]/\langle x^{q-1} - 1\rangle$. The
structure of the ring $R$ and its ideals may be studied as follows.

a. Show that

$$\varphi : R \to \mathbb{F}_q^{q-1}, \qquad c(x) \mapsto (c(1), c(\alpha), \ldots, c(\alpha^{q-2})) \tag{3.10}$$

defines a bijective mapping, which becomes an isomorphism of rings if
we introduce the component-wise product

$$(c_0, \ldots, c_{q-2}) \cdot (d_0, \ldots, d_{q-2}) = (c_0 d_0, \ldots, c_{q-2} d_{q-2})$$

as multiplication operation in $\mathbb{F}_q^{q-1}$. (The mapping $\varphi$ is a discrete
analogue of the Fourier transform since it takes polynomial products in
$R$ (convolution on the coefficients) to the component-wise products in
$\mathbb{F}_q^{q-1}$.)

b. Show that the ideals in the ring $\mathbb{F}_q^{q-1}$ (with the component-wise
product) are precisely the subsets of the following form. For each collection
of subscripts $S \subset \{0, 1, \ldots, q - 2\}$, let

$$I_S = \{(c_0, \ldots, c_{q-2}) : c_i = 0 \text{ for all } i \in S\}.$$

Then each ideal is equal to $I_S$ for some $S$.

c. Using the mapping $\varphi$, deduce from part b and Proposition (3.2) that
cyclic codes in $R$ are in one-to-one correspondence with subsets
$S \subset \{0, 1, \ldots, q - 2\}$, or equivalently subsets of the nonzero elements of the
field, $\mathbb{F}_q^*$. Given a cyclic code $C \subset R$, the corresponding subset of $\mathbb{F}_q^*$ is
called the set of zeroes of $C$. For Reed-Solomon codes the set of zeroes
has the form $\{\alpha, \ldots, \alpha^{q-k-1}\}$ (a “consecutive string” of zeroes starting
from $\alpha$).



Exercise 11. 

a. By constructing an appropriate transform $\varphi$ analogous to the map
in (3.10), or otherwise, show that the results of Exercise 10 may be
modified suitably to cover the case of m-dimensional cyclic codes of
block length $n = (q - 1)^m$. In particular, an m-dimensional cyclic
code $I$ in $\mathbb{F}_q[x_1, \ldots, x_m]/\langle x_1^{q-1} - 1, \ldots, x_m^{q-1} - 1\rangle$ is uniquely
specified by giving a set of zeroes (the points of $V(J)$) in
$(\mathbb{F}_q^*)^m = V(x_1^{q-1} - 1, \ldots, x_m^{q-1} - 1)$. (Readers of Chapter 2 should compare with
the discussion of finite-dimensional algebras in §2 of that chapter.)

b. Consider the 2-dimensional cyclic code $I$ in $\mathbb{F}_9[x, y]/\langle x^8 - 1, y^8 - 1\rangle$
generated by $g(x, y) = x^7y^7 + 1$. What is the dimension of $I$ (i.e., the
parameter $k$)? What is the corresponding set of zeroes in $(\mathbb{F}_9^*)^2$?

Exercise 12. In this exercise, we will explore the relation between the
zeroes of a cyclic code and its minimum distance. Let $\alpha$ be a primitive
element of $\mathbb{F}_q$. Consider a cyclic code $C$ of length $q - 1$ over $\mathbb{F}_q$ and suppose
that there exist $\ell$ and $\delta \ge 2$ such that the $\delta - 1$ consecutive powers of $\alpha$:

$$\alpha^\ell, \alpha^{\ell+1}, \ldots, \alpha^{\ell+\delta-2}$$

are distinct roots of the generator polynomial of $C$.

a. By considering the equations $c(\alpha^{\ell+j}) = 0$, $j = 0, \ldots, \delta - 2$, satisfied by
the codewords (written as polynomials), show that the vectors

$$(1, \alpha^{\ell+j}, \alpha^{2(\ell+j)}, \ldots, \alpha^{(q-2)(\ell+j)})^T$$

can be taken as columns of a parity check matrix $H$ for $C$.

b. Show that, possibly after removing common factors from the rows, all
the determinants of the $(\delta - 1) \times (\delta - 1)$ submatrices of $H$ formed using
entries in these columns are Vandermonde determinants.

c. Using Proposition (2.5), show that the minimum distance $d$ of $C$ satisfies
$d \ge \delta$.

d. Use the result of part c to rederive the minimum distance of a Reed- 
Solomon code. 

Exercise 13. (The BCH codes) Now consider cyclic codes $C$ of length
$q^m - 1$ over $\mathbb{F}_q$ for some $m \ge 1$.

a. Show that the result of Exercise 12 extends in the following way. Let $\alpha$
be a primitive element of $\mathbb{F}_{q^m}$, and suppose that there exist $\ell$ and $\delta \ge 2$
such that the $\delta - 1$ consecutive powers of $\alpha$:

$$\alpha^\ell, \alpha^{\ell+1}, \ldots, \alpha^{\ell+\delta-2}$$

are distinct roots of the generator polynomial $g(x) \in \mathbb{F}_q[x]$ of $C$. Show
that $C$ has minimum distance $d \ge \delta$.

b. The “narrow-sense” q-ary BCH code $BCH_q(m, t)$ is the cyclic code over
$\mathbb{F}_q$ whose generator polynomial is the least common multiple of the
minimal polynomials of $\alpha, \alpha^2, \ldots, \alpha^{2t} \in \mathbb{F}_{q^m}$ over $\mathbb{F}_q$. (The minimal
polynomial of $\beta \in \mathbb{F}_{q^m}$ over $\mathbb{F}_q$ is the nonzero polynomial of minimal
degree in $\mathbb{F}_q[u]$ with $\beta$ as a root.) Show that the minimum distance of
$BCH_q(m, t)$ is at least $2t + 1$. (The integer $2t + 1$ is called the designed
distance of the BCH code.)

c. Construct the generator polynomial for $BCH_3(2, 2)$ (a code over $\mathbb{F}_3$).
What is the dimension of this code?

d. Is it possible for the actual minimum distance of a BCH code to be
strictly larger than its designed distance? For example, show using
Proposition (2.5) that the actual minimum distance of the binary BCH
code $BCH_2(5, 4)$ satisfies $d \ge 11$ even though the designed distance
is only 9. Hint: Start by showing that if $\beta \in \mathbb{F}_{2^m}$ is a root of a
polynomial $p(u) \in \mathbb{F}_2[u]$, then so are $\beta^2, \beta^4, \ldots, \beta^{2^{m-1}}$. Readers familiar
with Galois theory for finite fields will recognize that we are applying
the Frobenius automorphism of $\mathbb{F}_{2^m}$ over $\mathbb{F}_2$ from Exercise 14 of §1
repeatedly here.

Exercise 14. Prove Proposition (3.8). 

Exercise 15. Reed-Solomon codes are now commonly used in situations 
such as communication to and from deep-space exploration craft, the CD 
digital audio system, and many others where errors tend to occur in 
“bursts” rather than randomly. One reason is that Reed-Solomon codes 
over an alphabet $\mathbb{F}_{2^r}$ with $r > 1$ can correct relatively long bursts of errors
on the bit level, even if the minimum distance $d$ is relatively small. Each
Reed-Solomon codeword may be represented as a string of $(2^r - 1)r$ bits,
since each symbol from $\mathbb{F}_{2^r}$ is represented by $r$ bits. Show that a burst of
$r\ell$ consecutive bit errors will change at most $\ell + 1$ of the entries of the
codeword, viewed as elements of $\mathbb{F}_{2^r}$. So if $\ell + 1 \le \lfloor (d - 1)/2 \rfloor$, a burst
error of length $r\ell$ can be corrected. Compare with Proposition (2.1).
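The counting claim in Exercise 15 can be explored numerically before proving it. The Python sketch below assumes $r = 8$ and $\ell = 3$ (values chosen here only for illustration) and finds the largest number of $r$-bit symbols that a burst of $r\ell$ consecutive bits can touch, over all starting positions within one codeword.

```python
def symbols_hit(start, length, r):
    """Number of r-bit symbols touched by a burst of `length` bits starting at bit `start`."""
    first = start // r
    last = (start + length - 1) // r
    return last - first + 1

r, ell = 8, 3                       # assumed example: symbols from F_{2^8}, burst of 3*8 bits
total_bits = (2**r - 1) * r         # number of bits in one codeword

# Try every starting position at which the whole burst fits inside the codeword.
worst = max(
    symbols_hit(s, r * ell, r)
    for s in range(total_bits - r * ell + 1)
)
print(worst, ell + 1)               # 4 4
```

A burst aligned with a symbol boundary touches only $\ell$ symbols; any other alignment spills into one more, giving $\ell + 1$.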



§4 Reed-Solomon Decoding Algorithms 

The syndrome decoding method that we described in §2 can be applied 
to decode any linear code. However, as noted there, for codes with large 
codimension $n - k$, a very large amount of information must be stored to
carry it out. In this section, we will see that there are much better methods 
available for the Reed-Solomon codes introduced in §3 — methods which ex- 
ploit their extra algebraic structure. Several different but related decoding 
algorithms for these codes have been considered. One well-known method 
is due to Berlekamp and Massey (see [Bla]). With suitable modifications, 
it also applies to the larger class of BCH codes mentioned in §3, and it
is commonly used in practice. Other algorithms paralleling the Euclidean
algorithm for the GCD of two polynomials have also been considered. Our
presentation will follow two papers of Fitzpatrick ([Fit1], [Fit2]) which
show how Grobner bases for modules over polynomial rings (see Chapter 5) 
can be used to give a framework for the computations involved. Decoding 
algorithms for m-dimensional cyclic codes using similar ideas have been 
considered by Sakata ([Sak]), Heegard-Saints ([HeS]) and others. 

To begin, we introduce some notation. We fix a field $\mathbb{F}_q$ and a primitive
element $\alpha$, and consider the Reed-Solomon code $C \subset \mathbb{F}_q[x]/\langle x^{q-1} - 1\rangle$ given
by a generator polynomial

$$g = (x - \alpha) \cdots (x - \alpha^{d-1})$$

of degree $d - 1$. By Proposition (3.7), we know that the dimension of $C$
is $k = q - d$, and the minimum distance of $C$ is $d$. For simplicity we will
assume that $d$ is odd: $d = 2t + 1$. Then by Proposition (2.1), any $t$ or fewer
errors in a received word should be correctable.

Let $c = \sum_{j=0}^{q-2} c_j x^j$ be a codeword of $C$. Since $C$ has generator
polynomial $g(x)$, this means that in $\mathbb{F}_q[x]$, $c$ is divisible by $g$. Suppose that $c$ is
transmitted, but some errors are introduced, so that the received word is
$y = c + e$ for some $e = \sum_{i \in I} e_i x^i$. $I$ is called the set of error locations and
the coefficients $e_i$ are known as the error values. To decode, we must solve
the following problem.

(4.1) Problem. Given a received word $y$, determine the set of error
locations $I$ and the error values $e_i$. Then the decoding function will return
$E^{-1}(y - e)$.

The set of values $E_j = y(\alpha^j)$, $j = 1, \ldots, d - 1$, serves the same purpose
as the syndrome of the received word for a general linear code. (It is not
the same thing though: the direct analog of the syndrome would be the
remainder on division by the generator. See Exercise 7 below.) First, we
can determine whether errors have occurred by computing the values $E_j$. If
$E_j = y(\alpha^j) = 0$ for all $j = 1, \ldots, d - 1$, then $y$ is divisible by $g$. Assuming
$t$ or fewer errors occurred, $y$ must be the codeword we intended to send. If
some $E_j \ne 0$, then there are errors and we can try to use the information
included in the $E_j$ to solve Problem (4.1). Note that the $E_j$ are the values
of the error polynomial for $j = 1, \ldots, d - 1$:

$$E_j = y(\alpha^j) = c(\alpha^j) + e(\alpha^j) = e(\alpha^j),$$

since $c$ is a multiple of $g$. (As in Exercise 10 from §3, we could also think
of the $E_j$ as a portion of the transform of the error polynomial.) The
polynomial

$$S(x) = \sum_{j=1}^{d-1} E_j x^{j-1}$$

is called the syndrome polynomial for $y$. Its degree is $d - 2$ or less. By
extending the definition of $E_j = e(\alpha^j)$ to all exponents $j$ we can also
consider the formal power series

$$E(x) = \sum_{j=1}^{\infty} E_j x^{j-1}. \tag{4.2}$$

(Since $\alpha^{q-1} = 1$, the coefficients in $E$ are periodic, with period at most $q - 1$,
and consequently $E$ is actually the series expansion of a rational function
of $x$; see (4.3) below. One can also solve the decoding problem by finding
the recurrence relation of minimal order on the coefficients in $E$. For the
basics of this approach see Exercise 6 below.)

Suppose we knew the error polynomial e for a received word with t or 
fewer errors. Then 

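$$E_j = \sum_{i \in I} e_i (\alpha^j)^i = \sum_{i \in I} e_i (\alpha^i)^j.$$

By expanding in formal geometric series, E(x) from (4.2) can be written
as

(4.3)    $E(x) = \sum_{i \in I} \frac{e_i \alpha^i}{1 - \alpha^i x} = \frac{\Omega(x)}{\Lambda(x)},$

where

$$\Lambda = \prod_{i \in I} (1 - \alpha^i x)$$

and

$$\Omega = \sum_{i \in I} e_i \alpha^i \prod_{j \ne i,\; j \in I} (1 - \alpha^j x).$$

The roots of $\Lambda$ are precisely the $\alpha^{-i}$ for $i \in I$. Since the error locations
can be determined easily from these roots, we call $\Lambda$ the error locator
polynomial. Turning to the numerator $\Omega$, we see that

$$\deg(\Omega) \le \deg(\Lambda) - 1.$$

In addition,

$$\Omega(\alpha^{-i}) = e_i \alpha^i \prod_{j \ne i,\; j \in I} (1 - \alpha^j \alpha^{-i}) \ne 0.$$

Hence $\Omega$ has no roots in common with $\Lambda$. From this we deduce the
important observation that the polynomials $\Omega$ and $\Lambda$ must be relatively
prime.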
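Similarly, if we consider the “tail” of the series E,

(4.4)    $\sum_{j=d}^{\infty} \Bigl( \sum_{i \in I} e_i (\alpha^i)^j \Bigr) x^{j-d} = \frac{\Gamma(x)}{\Lambda(x)},$

where

$$\Gamma = \sum_{i \in I} e_i \alpha^{id} \prod_{j \ne i,\; j \in I} (1 - \alpha^j x).$$

The degree of $\Gamma$ is also at most $\deg(\Lambda) - 1$.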

Combining (4.3) and (4.4), and writing $d - 1 = 2t$, we obtain the relation

(4.5)    $\Omega = \Lambda S + x^{2t} \Gamma.$
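
To see where this relation comes from, one can split the series E(x) after its first $d - 1$ terms. The lines below sketch that step using only the definitions of S, $\Omega$, $\Lambda$ and $\Gamma$ given above, together with $d - 1 = 2t$:

```latex
\begin{align*}
E(x) &= \sum_{j=1}^{d-1} E_j x^{j-1} \;+\; x^{d-1} \sum_{j=d}^{\infty} E_j x^{j-d}
      \;=\; S(x) + x^{2t}\,\frac{\Gamma(x)}{\Lambda(x)}, \\[4pt]
\frac{\Omega(x)}{\Lambda(x)} &= S(x) + x^{2t}\,\frac{\Gamma(x)}{\Lambda(x)}
\quad\Longrightarrow\quad
\Omega = \Lambda S + x^{2t}\,\Gamma .
\end{align*}
```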

For some purposes, it will be more convenient to regard (4.5) as a
congruence. The equation (4.5) implies that

(4.6)    $\Omega \equiv \Lambda S \bmod x^{2t}.$

Conversely, if (4.6) holds, there is some polynomial $\Gamma$ such that (4.5) holds.
The congruence (4.6), or sometimes its explicit form (4.5), is called the key
equation for decoding.

The derivation of the key equation (4.6) assumed e was known. But now
consider the situation in an actual decoding problem, assuming that no
more than t errors occurred. Given the received word y, S is computed.
The key equation (4.6) is now viewed as a relation between the known
polynomials S, $x^{2t}$ and the unknowns $\Omega$, $\Lambda$. Suppose a solution $(\Omega, \Lambda)$ of
the key equation is found, which satisfies the following degree conditions:

(4.7)    $\deg(\Lambda) \le t, \qquad \deg(\Omega) < \deg(\Lambda),$

and in which $\Omega$, $\Lambda$ are relatively prime. We claim that in such a solution $\Lambda$
must be a factor of $x^{q-1} - 1$, and its roots give the inverses of the error
locations. This is a consequence of the following uniqueness statement.



(4.8) Theorem. Suppose that t or fewer errors occur in the received word
y and let S be the corresponding syndrome polynomial. Up to a constant
multiple, there exists a unique solution $(\Omega, \Lambda)$ of (4.6) that satisfies the
degree conditions (4.7), and for which $\Omega$ and $\Lambda$ are relatively prime.

Proof. As above, the actual error locator $\Lambda$ and the corresponding $\Omega$ give
one such solution. Let $(\overline{\Omega}, \overline{\Lambda})$ be any other. From the congruences

$$\overline{\Omega} \equiv \overline{\Lambda} S \bmod x^{2t}$$
$$\Omega \equiv \Lambda S \bmod x^{2t},$$

multiplying the second by $\overline{\Lambda}$, the first by $\Lambda$ and subtracting, we obtain

$$\overline{\Omega} \Lambda \equiv \Omega \overline{\Lambda} \bmod x^{2t}.$$

Since the degree conditions (4.7) are satisfied for both solutions, both sides
of this congruence are actually polynomials of degree at most $2t - 1$, so it
follows that

$$\overline{\Omega} \Lambda = \Omega \overline{\Lambda}.$$

Since $\Lambda$ and $\Omega$ are relatively prime, and similarly for $\overline{\Lambda}$ and $\overline{\Omega}$, $\Lambda$ must
divide $\overline{\Lambda}$ and vice versa. Similarly for $\Omega$ and $\overline{\Omega}$. As a result, $\Lambda$ and $\overline{\Lambda}$ differ
at most by a constant multiple. Similarly for $\Omega$ and $\overline{\Omega}$, and the constants
must agree. □

Given a solution of (4.6) for which the conditions of Theorem (4.8) are
satisfied, working backwards, we can determine the roots of $\Lambda = 0$ in $\mathbb{F}_q^*$,
and hence the error locations: if $\alpha^{-i}$ appears as a root, then $i \in I$ is an
error location. Finally, the error values can be determined by the following
observation.

Exercise 1. Let $(\Omega, \Lambda)$ be the solution of (4.6) in which the actual error
locator polynomial $\Lambda$ (with constant term 1) appears. If $i \in I$, show that

$$\Omega(\alpha^{-i}) = \alpha^i e_i \chi_i(\alpha^{-i}),$$

where $\chi_i = \prod_{j \ne i} (1 - \alpha^j x)$. Hence we can solve for $e_i$, knowing the error
locations. The resulting expression is called the Forney formula for the
error value.

Theorem (4.8) and the preceding discussion show that solving the decod- 
ing problem (4.1) can be accomplished by solving the key equation (4.6). 
It is here that the theory of module Gröbner bases can be applied to good
effect. Namely, given the integer t and $S \in \mathbb{F}_q[x]$, consider the set of all
pairs $(\Omega, \Lambda) \in \mathbb{F}_q[x]^2$ satisfying (4.6):

$$K = \{(\Omega, \Lambda) : \Omega \equiv \Lambda S \bmod x^{2t}\}.$$

Exercise 2. Show that K is an $\mathbb{F}_q[x]$-submodule of $\mathbb{F}_q[x]^2$. Also show that
every element of K can be written as a combination (with polynomial
coefficients) of the two generators

(4.9)    $g_1 = (x^{2t}, 0)$ and $g_2 = (S, 1)$.

Hint: For the last part it may help to consider the related module

$$\overline{K} = \{(\Omega, \Lambda, \Gamma) : \Omega = \Lambda S + x^{2t} \Gamma\}$$

and the elements $(\Omega, \Lambda, \Gamma) = (x^{2t}, 0, 1), (S, 1, 0)$ in $\overline{K}$.
The generators for K given in (4.9) involve only the known polynomials
for the decoding problem with syndrome S. Following Fitzpatrick, we will
now show that (4.9) is a Gröbner basis for K with respect to one monomial
order on $\mathbb{F}_q[x]^2$. Moreover, one of the special solutions $(\Omega, \Lambda) \in K$ given
by Theorem (4.8) is guaranteed to occur in a Gröbner basis for K with
respect to a second monomial order on $\mathbb{F}_q[x]^2$. These results form the basis
for two different decoding methods that we will indicate.

To prepare for this, we need to begin by developing some preliminary
facts about submodules of $\mathbb{F}_q[x]^2$ and monomial orders. The situation here
is very simple compared to the general situation studied in Chapter 5. We
will restrict our attention to submodules $M \subset \mathbb{F}_q[x]^2$ such that the quotient
$\mathbb{F}_q[x]^2/M$ is finite-dimensional as a vector space over $\mathbb{F}_q$. We will see below
that this is always the case for the module K with generators as in (4.9).
There is a characterization of these submodules that is very similar to the
Finiteness Theorem for quotients $k[x_1, \ldots, x_n]/I$ from Chapter 2, §2.

(4.10) Proposition. Let k be any field, and let M be a submodule of $k[x]^2$.
Let > be any monomial order on $k[x]^2$. Then the following conditions are
equivalent:

a. The k-vector space $k[x]^2/M$ is finite-dimensional.

b. $\langle \mathrm{LT}_>(M) \rangle$ contains elements of the form $x^u e_1 = (x^u, 0)$ and
$x^v e_2 = (0, x^v)$ for some $u, v \ge 0$.

Proof. Let $\mathcal{G}$ be a Gröbner basis for M with respect to the monomial
order >. As in the ideal case, the elements of $k[x]^2/M$ are linear combinations
of monomials in the complement of $\langle \mathrm{LT}_>(M) \rangle$. There is a finite number of
such monomials if and only if $\langle \mathrm{LT}_>(M) \rangle$ contains multiples of both $e_1$ and
$e_2$. □

Every submodule we consider from now on in this section will satisfy the 
equivalent conditions in (4.10), even if no explicit mention is made of that 
fact. 

The monomial orders that come into play in decoding are special cases
of weight orders on $\mathbb{F}_q[x]^2$. They can also be described very simply “from
scratch” as follows.

(4.11) Definition. Let $r \in \mathbb{Z}$, and define an order $>_r$ by the following
rules. First, $x^m e_i >_r x^n e_i$ if $m > n$ and $i = 1$ or 2. Second, $x^m e_2 >_r x^n e_1$
if and only if $m + r \ge n$.

For example, with $r = 2$, the monomials in $k[x]^2$ are ordered by $>_2$ as
follows:

$$e_1 <_2 x e_1 <_2 x^2 e_1 <_2 e_2 <_2 x^3 e_1 <_2 x e_2 <_2 x^4 e_1 <_2 \cdots.$$
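
The two rules of Definition (4.11) translate directly into a comparison function. The following Python sketch is only an illustration: the encoding of the monomial $x^m e_i$ as the pair (m, i) and the names `order_key` and `show` are choices made here, not notation from the text. Sorting a few monomials with r = 2 gives a chain that can be compared with the one displayed above.

```python
from functools import cmp_to_key

def order_key(r):
    """Sort key for the order >_r; the monomial x^m e_i is encoded as (m, i)."""
    def compare(a, b):
        (m, i), (n, j) = a, b
        if (m, i) == (n, j):
            return 0
        if i == j:                      # same component: compare exponents
            return 1 if m > n else -1
        if i == 2:                      # x^m e_2 >_r x^n e_1  iff  m + r >= n
            return 1 if m + r >= n else -1
        return -1 if n + r >= m else 1  # x^m e_1 against x^n e_2: mirror case
    return cmp_to_key(compare)

def show(mono):
    """Readable form of a monomial, e.g. (3, 1) -> 'x^3e1'."""
    m, i = mono
    return ("" if m == 0 else "x" if m == 1 else f"x^{m}") + f"e{i}"

# All monomials x^m e_i with 0 <= m <= 4, sorted increasingly for r = 2.
monomials = [(m, i) for m in range(5) for i in (1, 2)]
chain = sorted(monomials, key=order_key(2))
print(" < ".join(show(t) for t in chain))
```

Changing the argument of `order_key` gives the corresponding chain for any other integer r.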
Exercise 3.

a. Show that $>_r$ defines a monomial order on $k[x]^2$ for each $r \in \mathbb{Z}$.

b. How are the monomials in $k[x]^2$ ordered under $>_{-2}$?

c. Show that the $>_0$ and $>_{-1}$ orders coincide with TOP (term over
position) orders as introduced in Chapter 5 (for different orderings of the
standard basis).

d. Are the POT (position over term) orders special cases of the $>_r$ orders?
Why or why not?



Gröbner bases for submodules with respect to the $>_r$ orders have very
special forms.

(4.12) Proposition. Let M be a submodule of $k[x]^2$, and fix $r \in \mathbb{Z}$.
Assume $\langle \mathrm{LT}_{>_r}(M) \rangle$ is generated by $x^u e_1 = (x^u, 0)$ and $x^v e_2 = (0, x^v)$ for
some $u, v \ge 0$. Then a subset $\mathcal{G} \subset M$ is a reduced Gröbner basis of M with
respect to $>_r$ if and only if $\mathcal{G} = \{g_1 = (g_{11}, g_{12}), g_2 = (g_{21}, g_{22})\}$, where
the $g_i$ satisfy the following two properties:

a. $\mathrm{LT}(g_1) = x^u e_1$ (in $g_{11}$), and $\mathrm{LT}(g_2) = x^v e_2$ (in $g_{22}$) for u, v as above.

b. $\deg(g_{21}) < u$ and $\deg(g_{12}) < v$.

Proof. Suppose $\mathcal{G}$ is a subset of M satisfying conditions a, b. By a, the
leading terms of the elements of $\mathcal{G}$ generate $\langle \mathrm{LT}(M) \rangle$, so by definition $\mathcal{G}$
is a Gröbner basis for M. Condition b implies that no terms in $g_1$ can be
removed by division with respect to $g_2$ and vice versa, so $\mathcal{G}$ is reduced.
Conversely, if $\mathcal{G}$ is a reduced Gröbner basis for M with respect to $>_r$ it
must contain exactly two elements. Numbering the generators $g_1$ and $g_2$ as
above, condition a must hold. Finally b must hold if $\mathcal{G}$ is reduced. (Note,
fixing the leading terms in $g_1$ and $g_2$ implies that the other components
satisfy $\deg(g_{12}) + r < u$ and $\deg(g_{21}) \le v + r$.) □

An immediate, but important, consequence of Proposition (4.12) is the 
following observation. 

(4.13) Corollary. Let $\mathcal{G} = \{(S, 1), (x^{2t}, 0)\}$ be the generators for the
module K of solutions of the key equation in the decoding problem with
syndrome S. Then $\mathcal{G}$ is a Gröbner basis for K with respect to the order
$>_{\deg(S)}$.

Note $\mathrm{LT}_{>_{\deg(S)}}((S, 1)) = (0, 1) = e_2$, so the module of solutions of the key
equation always satisfies the finiteness condition from Proposition (4.10).
We leave the proof of Corollary (4.13) as an exercise for the reader.

The final general fact we will need to know is another consequence of the
definition of a Gröbner basis. First we introduce some terminology.

(4.14) Definition. Let M be a submodule of $k[x]^2$. A minimal element
of M with respect to a monomial order > is a nonzero $g \in M$ such that
$\mathrm{LT}(g)$ is minimal with respect to >.

For instance, from (4.13), (S, 1) is minimal with respect to the order
$>_{\deg(S)}$ in $\langle (S, 1), (x^{2t}, 0) \rangle$ since

$$e_2 = \mathrm{LT}((S, 1)) <_{\deg(S)} \mathrm{LT}((x^{2t}, 0)) = x^{2t} e_1,$$

and these leading terms generate $\langle \mathrm{LT}(K) \rangle$ for the $>_{\deg(S)}$ order.

Exercise 4. Show that minimal elements of $M \subset k[x]^2$ are unique, up to
a nonzero constant multiple.

As in the example above, once we fix an order $>_r$, a minimal element for
M with respect to that order is guaranteed to appear in a Gröbner basis
for M with respect to $>_r$.

(4.15) Proposition. Fix any $>_r$ order on $k[x]^2$, and let M be a
submodule. Every Gröbner basis for M with respect to $>_r$ contains a minimal
element with respect to $>_r$.

We leave the easy proof to the reader. Now we come to the main point.
The special solution of the key equation (4.6) guaranteed by Theorem (4.8)
can be characterized as the minimal element of the module K with respect
to a suitable order.

(4.16) Proposition. Let $g = (\Omega, \Lambda)$ be a solution of the key equation
satisfying the degree conditions (4.7) and with components relatively prime
(which is unique up to constant multiple by Theorem (4.8)). Then g is a
minimal element of K under the $>_{-1}$ order.

Proof. An element $g = (\Omega, \Lambda) \in K$ satisfies $\deg(\Lambda) > \deg(\Omega)$ if and only
if its leading term with respect to $>_{-1}$ is a multiple of $e_2$. The elements of
K given by Theorem (4.8) have this property and have minimal possible
$\deg(\Lambda)$, so their leading terms are minimal among leading terms that are
multiples of $e_2$.

Aiming for a contradiction now, suppose that g is not minimal, or
equivalently that there is some nonzero $h = (A, B)$ in K such that
$\mathrm{LT}(h) <_{-1} \mathrm{LT}(g)$. Then by the remarks above, $\mathrm{LT}(h)$ must be a multiple of
$e_1$, that is, it must appear in A, so

(4.17)    $\deg(\Lambda) > \deg(A) \ge \deg(B).$

But both h and g are solutions of the key equation:

$$A \equiv S B \bmod x^{2t}$$
$$\Omega \equiv S \Lambda \bmod x^{2t}.$$

Multiplying the second congruence by B, the first by $\Lambda$, and subtracting,
we obtain

(4.18)    $\Lambda A \equiv B \Omega \bmod x^{2t}.$

We claim this contradicts the inequalities on degrees above. Recall that
$\deg(\Lambda) \le t$ and $\deg(\Omega) < \deg(\Lambda)$, hence $\deg(\Omega) \le t - 1$. But from (4.17),
it follows that $\deg(A) \le t - 1$. The product on the left of (4.18) has degree
at most $2t - 1$, so both sides have degree less than $2t$ and (4.18) is an
equality of polynomials. Yet the product on the right side has degree strictly
less than the product on the left. But that is absurd. □

Combining (4.16) and (4.15), we see that the special solution of the key
equation that we seek can be found in a Gröbner basis for K with respect to
the $>_{-1}$ order. This gives at least two possible ways to proceed in decoding.

1. We could use the generating set

$$\{(S, 1), (x^{2t}, 0)\}$$

for K, apply Buchberger’s algorithm (or a suitable variant adapted to
the special properties of modules over the one variable polynomial ring
$\mathbb{F}_q[x]$), and compute a Gröbner basis for K with respect to $>_{-1}$ directly.
Then the minimal element g which solves the decoding problem will
appear in the Gröbner basis.

2. Alternatively, we could make use of the fact recorded in Corollary (4.13).
Since $\mathcal{G} = \{(S, 1), (x^{2t}, 0)\}$ is already a Gröbner basis for K with respect
to another order, and $\mathbb{F}_q[x]^2/K$ is finite-dimensional over $\mathbb{F}_q$, we can
use an extension of the Faugère-Gianni-Lazard-Mora (FGLM) Gröbner
basis conversion algorithm from §3 of Chapter 2 (see [Fit2]) to convert
$\{(S, 1), (x^{2t}, 0)\}$ into a Gröbner basis $\mathcal{G}'$ for the same module, but with
respect to the $>_{-1}$ order. Then as in approach 1, the minimal element
in K will be an element of $\mathcal{G}'$.

Yet another possibility would be to build up to the desired solution of
the key equation inductively, solving the congruences

$$\Omega \equiv \Lambda S \bmod x^{\ell}$$

for $\ell = 1, 2, \ldots, 2t$ in turn. This approach gives one way to understand the
operations from the Berlekamp-Massey algorithm mentioned above. See
[Fit1] for a Gröbner basis interpretation of this method.

Of the two approaches detailed above, a deeper analysis shows that the 
first approach is more efficient for long codes. But both are interesting from 
the mathematical standpoint. We will discuss the second approach in the 
text to conclude this section, and indicate how the first might proceed in the 
exercises. One observation we can make here is that the full analog of the 
FGLM algorithm need not be carried out. Instead, we need only consider 
the monomials in $\mathbb{F}_q[x]^2$ one by one in increasing $>_{-1}$ order and stop on
the first instance of a linear dependence among the remainders of those
monomials on division by $\mathcal{G}$. Here is the algorithm (see [Fit2], Algorithm
3.5). It uses a subalgorithm called nextmonom which takes a monomial u
and returns the next monomial after u in $\mathbb{F}_q[x]^2$ in the $>_{-1}$ order. (Since
we will stop after one element of the new Gröbner basis is obtained, we do
not need to check whether the next monomial is a multiple of the leading
terms of the other new basis elements as we did in the full FGLM algorithm
in Chapter 2.)

(4.19) Proposition. The following algorithm computes the minimal
element of the module K of solutions of the key equation with respect to the
$>_{-1}$ order:

Input: $\mathcal{G} = \{(S, 1), (x^{2t}, 0)\}$
Output: $(\Omega, \Lambda)$ minimal in $K = \langle \mathcal{G} \rangle$ with respect to $>_{-1}$
Uses: Division algorithm with respect to $\mathcal{G}$, using $>_{\deg(S)}$ order,
nextmonom

$t_1 := (0, 1)$; $R_1 := \overline{t_1}^{\mathcal{G}}$; $j := 1$
done := false
WHILE done = false DO
    $t_{j+1}$ := nextmonom($t_j$)
    $R_{j+1} := \overline{t_{j+1}}^{\mathcal{G}}$
    IF there are $c_i \in \mathbb{F}_q$ with $R_{j+1} = \sum_{i=1}^{j} c_i R_i$ THEN
        $(\Omega, \Lambda) := t_{j+1} - \sum_{i=1}^{j} c_i t_i$
        done := true
    ELSE
        $j := j + 1$

Exercise 5. Prove that this algorithm always terminates and correctly
computes the minimal element of $K = \langle \mathcal{G} \rangle$ with respect to $>_{-1}$. Hint: See
the proof of Theorem (3.4) in Chapter 2; this situation is simpler in several
ways, though.



We illustrate the decoding method based on this algorithm with an
example. Let C be the Reed-Solomon code over $\mathbb{F}_9$, with

$$g = (x - \alpha)(x - \alpha^2)(x - \alpha^3)(x - \alpha^4),$$

and $d = 5$. We expect to be able to correct any two errors in a codeword.
We claim that

$$c = x^7 + 2x^5 + x^2 + 2x + 1$$

is a codeword for C. This follows for instance from a Maple computation
such as this one. After initializing the field (a below is the primitive element
$\alpha$ for $\mathbb{F}_9$), setting c equal to the polynomial above, and g equal to the
generator,

Rem(c,g,x) mod 3;

returns 0, showing that g divides c.

Suppose that errors occur in transmission of c, yielding the received word

$$y = x^7 + \alpha x^5 + (\alpha + 2)x^2 + 2x + 1.$$

(Do you see where the errors occurred?) We begin by computing the
syndrome S. Using Maple, we find $y(\alpha) = \alpha + 2$, $y(\alpha^2) = y(\alpha^3) = 2$, and
$y(\alpha^4) = 0$. For example, the calculation of $y(\alpha)$ can be done simply by
initializing the field, defining y as above, then computing

Normal(subs(x=a,y)) mod 3;

So we have

$$S = 2x^2 + 2x + \alpha + 2.$$
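
The codeword test and the syndrome values can also be checked outside Maple. The Python sketch below is an illustration under a stated assumption: it models $\mathbb{F}_9$ as $\mathbb{F}_3[a]/(a^2 + a + 2)$, a choice that agrees with the identity $1/\alpha = \alpha + 1$ used later in this example, and stores $u + v\alpha$ as the pair (u, v). The helper names are hypothetical. Here c is $x^7 + 2x^5 + x^2 + 2x + 1$ and y is $x^7 + \alpha x^5 + (\alpha + 2)x^2 + 2x + 1$.

```python
# GF(9) modeled as F_3[a]/(a^2 + a + 2); the element u + v*a is the pair (u, v).
# This modulus is an assumption, chosen to agree with 1/a = a + 1 in the example.
ZERO, ONE, A = (0, 0), (1, 0), (0, 1)

def add(p, q):
    return ((p[0] + q[0]) % 3, (p[1] + q[1]) % 3)

def mul(p, q):
    (u, v), (c, d) = p, q
    # a^2 = -a - 2 = 2a + 1 in characteristic 3
    return ((u * c + v * d) % 3, (u * d + v * c + 2 * v * d) % 3)

def power(p, n):
    result = ONE
    for _ in range(n):
        result = mul(result, p)
    return result

def evaluate(coeffs, x):
    """Horner evaluation; coeffs[k] is the coefficient of x^k."""
    value = ZERO
    for coeff in reversed(coeffs):
        value = add(mul(value, x), coeff)
    return value

def poly(terms, length=8):
    """Build a coefficient list from a dict {exponent: coefficient}."""
    return [terms.get(k, ZERO) for k in range(length)]

c = poly({7: ONE, 5: (2, 0), 2: ONE, 1: (2, 0), 0: ONE})
y = poly({7: ONE, 5: A, 2: (2, 1), 1: (2, 0), 0: ONE})

# g divides c exactly when c vanishes at a, a^2, a^3, a^4.
codeword_values = [evaluate(c, power(A, j)) for j in range(1, 5)]
# E_1, ..., E_4 are the coefficients of S, from low degree to high degree.
syndromes = [evaluate(y, power(A, j)) for j in range(1, 5)]
print(codeword_values)
print(syndromes)
```

In the pair encoding, (2, 1) stands for $\alpha + 2$ and (2, 0) for 2, so the list of syndromes can be read off against the coefficients of S.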



By Theorem (4.8), we need to consider the module K of solutions of the
key equation

$$\Omega \equiv \Lambda S \bmod x^4.$$

By Corollary (4.13), $\mathcal{G} = \{(x^4, 0), (2x^2 + 2x + \alpha + 2, 1)\}$ is the reduced
Gröbner basis for K with respect to the order $>_2$. Applying Proposition
(4.19), we find

$t_1 = (0, 1)$      $R_1 = (x^2 + x + 2\alpha + 1, 0)$
$t_2 = (1, 0)$      $R_2 = (1, 0)$
$t_3 = (0, x)$      $R_3 = (x^3 + x^2 + (2\alpha + 1)x, 0)$
$t_4 = (x, 0)$      $R_4 = (x, 0)$
$t_5 = (0, x^2)$    $R_5 = (x^3 + (2\alpha + 1)x^2, 0).$

Here for the first time we obtain a linear dependence:

$$R_5 = -(\alpha R_1 + (\alpha + 1)R_2 + 2R_3 + (\alpha + 1)R_4).$$

Hence,

$$\alpha t_1 + (\alpha + 1)t_2 + 2t_3 + (\alpha + 1)t_4 + t_5 = (\alpha + 1 + (\alpha + 1)x, \; \alpha + 2x + x^2)$$

is the minimal element $(\Omega, \Lambda)$ of K that we are looking for.
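
The search for the first linear dependence can be sketched in Python as follows. This is an illustration of the idea behind Proposition (4.19), not a transcription of the algorithm in [Fit2]. It assumes the model $\mathbb{F}_9 = \mathbb{F}_3[a]/(a^2 + a + 2)$ with $u + v\alpha$ stored as (u, v), and it uses the observation that the remainder of (A, B) on division by $\mathcal{G}$ in the $>_{\deg(S)}$ order is $((A - BS) \bmod x^{2t}, 0)$. The function names are hypothetical. Each remainder is reduced against the earlier ones by Gaussian elimination, while the same operations are applied to the monomials themselves.

```python
# GF(9) modeled as F_3[a]/(a^2 + a + 2) (an assumption matching 1/a = a + 1);
# the element u + v*a is the pair (u, v).
ZERO, ONE = (0, 0), (1, 0)
ELEMENTS = [(u, v) for u in range(3) for v in range(3)]

def add(p, q):
    return ((p[0] + q[0]) % 3, (p[1] + q[1]) % 3)

def neg(p):
    return ((-p[0]) % 3, (-p[1]) % 3)

def mul(p, q):
    (u, v), (c, d) = p, q
    return ((u * c + v * d) % 3, (u * d + v * c + 2 * v * d) % 3)

def inv(p):
    return next(q for q in ELEMENTS if mul(p, q) == ONE)

def sub_multiple(c, xs, ys):
    """Return ys - c * xs, coefficient by coefficient."""
    return [add(y, neg(mul(c, x))) for x, y in zip(xs, ys)]

def minimal_solution(S, t):
    """Minimal element of K for the order >_{-1}; S lists the 2t syndrome coefficients."""
    n = 2 * t
    basis = []  # rows (pivot, R, Omega, Lambda), each reduced against earlier rows
    m = 0
    while True:
        for comp in (2, 1):  # increasing >_{-1} order: x^m e_2, then x^m e_1
            omega, lam = [ZERO] * (n + 1), [ZERO] * (n + 1)
            if comp == 2:
                lam[m] = ONE
                # remainder of (0, x^m) on division by G: (-x^m S mod x^(2t), 0)
                rem = ([ZERO] * m + [neg(s) for s in S])[:n]
            else:
                omega[m] = ONE
                rem = [ONE if k == m else ZERO for k in range(n)]
            for pivot, R, Om, La in basis:
                coeff = rem[pivot]
                if coeff != ZERO:
                    rem = sub_multiple(coeff, R, rem)
                    omega = sub_multiple(coeff, Om, omega)
                    lam = sub_multiple(coeff, La, lam)
            if all(r == ZERO for r in rem):
                return omega, lam  # first linear dependence among the remainders
            pivot = next(k for k, r in enumerate(rem) if r != ZERO)
            scale = inv(rem[pivot])
            basis.append((pivot,
                          [mul(scale, r) for r in rem],
                          [mul(scale, w) for w in omega],
                          [mul(scale, l) for l in lam]))
        m += 1

S = [(2, 1), (2, 0), (2, 0), (0, 0)]  # S = 2x^2 + 2x + a + 2, low degree first
Omega, Lambda = minimal_solution(S, 2)
print(Omega[:2], Lambda[:3])
```

The returned lists hold coefficients from low degree to high degree, so they can be compared directly with the pair $(\alpha + 1 + (\alpha + 1)x, \; \alpha + 2x + x^2)$.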
The error locations are found by solving

$$\Lambda(x) = x^2 + 2x + \alpha = 0.$$

Recall, by definition $\Lambda = \prod_{i \in I} (1 - \alpha^i x)$ has constant term 1, so we need to
adjust constants to get the actual error-locator polynomial and the correct
$\Omega$ to use in the determination of the error values, using the Forney formula
of Exercise 1. Dividing by $\alpha$, we obtain $\Lambda = (\alpha + 1)x^2 + (2\alpha + 2)x + 1$.
By factoring, or by an exhaustive search for the roots as in

for j to 8 do
Normal(subs(x = a^j, Lambda)) mod 3;
od;

we find that the roots are $x = \alpha^3$ and $x = \alpha^6$. Taking the exponents of
the inverses gives the error locations: $(\alpha^3)^{-1} = \alpha^5$ and $(\alpha^6)^{-1} = \alpha^2$, so
the errors occurred in the coefficients of $x^2$ and $x^5$. (Check the codeword c
and the received word y above to see that this is correct.) Next, we apply
Exercise 1 to obtain the error values. We have

$$\Omega = (1/\alpha)((\alpha + 1)x + \alpha + 1) = (\alpha + 2)x + \alpha + 2.$$

For the error location $i = 2$, for instance, we have $\chi_2(x) = 1 - \alpha^5 x$, and

$$e_2 = \frac{\Omega(\alpha^{-2})}{\alpha^2 \chi_2(\alpha^{-2})} = \alpha + 1.$$

This also checks. The error value $e_5 = \alpha + 1$ is determined similarly; to
decode we subtract $e = (\alpha + 1)x^2 + (\alpha + 1)x^5$ from y, and we recover the
correct codeword.
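
The root search and the Forney formula of Exercise 1 can be sketched in Python in the same spirit. As before, this assumes the model $\mathbb{F}_9 = \mathbb{F}_3[a]/(a^2 + a + 2)$ with $u + v\alpha$ stored as (u, v); the function `forney` and the other names are hypothetical. The inputs are the normalized polynomials $\Lambda = (\alpha + 1)x^2 + (2\alpha + 2)x + 1$ and $\Omega = (\alpha + 2)x + \alpha + 2$ from the example.

```python
# GF(9) modeled as F_3[a]/(a^2 + a + 2) (an assumption matching 1/a = a + 1);
# the element u + v*a is the pair (u, v).
ZERO, ONE, A = (0, 0), (1, 0), (0, 1)

def add(p, q):
    return ((p[0] + q[0]) % 3, (p[1] + q[1]) % 3)

def mul(p, q):
    (u, v), (c, d) = p, q
    return ((u * c + v * d) % 3, (u * d + v * c + 2 * v * d) % 3)

def power(p, n):
    """p^n for nonzero p; exponents are reduced mod 8 because p^8 = 1."""
    result = ONE
    for _ in range(n % 8):
        result = mul(result, p)
    return result

def evaluate(coeffs, x):
    """Horner evaluation; coeffs[k] is the coefficient of x^k."""
    value = ZERO
    for coeff in reversed(coeffs):
        value = add(mul(value, x), coeff)
    return value

Lambda = [ONE, (2, 2), (1, 1)]   # 1 + (2a + 2)x + (a + 1)x^2
Omega = [(2, 1), (2, 1)]         # (a + 2) + (a + 2)x

# Exhaustive search: exponents j with Lambda(a^j) = 0.
root_exponents = [j for j in range(8) if evaluate(Lambda, power(A, j)) == ZERO]
# A root a^j = a^(-i) marks the error location i = -j mod 8.
locations = sorted((-j) % 8 for j in root_exponents)

def forney(i):
    """Error value e_i = Omega(a^-i) / (a^i * chi_i(a^-i))."""
    x = power(A, -i)
    chi = ONE
    for j in locations:
        if j != i:
            # factor 1 - a^j x, written as 1 + 2 * a^j * x in characteristic 3
            chi = mul(chi, add(ONE, mul((2, 0), mul(power(A, j), x))))
    denom = mul(power(A, i), chi)
    return mul(evaluate(Omega, x), power(denom, 7))  # denom^7 = 1/denom

errors = {i: forney(i) for i in locations}
print(root_exponents, locations, errors)
```

In the pair encoding, (1, 1) stands for $\alpha + 1$, the error value found at both locations.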

In the Exercises below, we will consider how (part of) a direct
calculation of the Gröbner basis for K with respect to $>_{-1}$ can also be used for
decoding.



Additional Exercises for §4 

Exercise 6. Let $(\Omega, \Lambda)$ be any solution of the congruence (4.6), where S
is the syndrome polynomial for some correctable error.

a. Writing $\Lambda = \sum_{i=0}^{t} \Lambda_i x^i$ and $S = \sum_{j=1}^{2t} E_j x^{j-1}$, show that (4.6) yields
the following system of t homogeneous linear equations for the $t + 1$
coefficients in $\Lambda$:

(4.20)    $\sum_{k=0}^{t} \Lambda_k E_{t+\ell-k} = 0$

for each $\ell = 1, \ldots, t$.

b. Assuming no more than t errors occurred, say in the locations given
by a set of indices I, $E_{t+\ell-k} = \sum_{i \in I} e_i \alpha^{i(t+\ell-k)}$ for some polynomial
e(x) with t or fewer nonzero terms. Substitute in (4.20) and rearrange
to obtain

(4.21)    $0 = \sum_{k=0}^{t} \Lambda_k E_{t+\ell-k} = \sum_{i \in I} e_i \Lambda(\alpha^{-i}) \alpha^{i(t+\ell)}.$

c. Show that the last equation in (4.21) implies that $\Lambda(\alpha^{-i}) = 0$ for all
$i \in I$, which gives another proof that the actual error locator polynomial
divides $\Lambda$. Hint: the equations
in (4.21) can be viewed as a system of homogeneous linear equations in
the unknowns $e_i \Lambda(\alpha^{-i})$. The matrix of coefficients has a notable special
form. Also, $e_i \ne 0$ for $i \in I$.

Solving the decoding problem can be rephrased as finding the linear
recurrence relation (4.20) of minimal order for the $E_j$ sequence. The
coefficients $\Lambda_k$ then give the error locator polynomial.

Exercise 7. A direct analog of syndrome decoding for Reed-Solomon codes
might begin by computing the remainder on division of a received word y by
the generator, giving an expression $y = c + R$, where c is a codeword. How
is the remainder R related to the error polynomial e? Is this c necessarily
the nearest codeword to y? (There is another decoding method for
Reed-Solomon codes, due to Welch and Berlekamp, that uses R rather than
the syndrome S. It can also be rephrased as solving a key equation, and
Gröbner bases can be applied to solve that equation also.)

Exercise 8. Prove Corollary (4.13). 

Exercise 9. Prove Proposition (4.15). Hint: Think about the definition of
a Gröbner basis.

Exercise 10. Consider the Reed-Solomon code over $\mathbb{F}_9$ with generator
polynomial $g = (x - \alpha)(x - \alpha^2)$ ($d = 3$, so this code is 1 error-correcting).
Perform computations using Proposition (4.19) to decode the received
words

$$y(x) = x^7 + \alpha x^5 + (\alpha + 2)x^3 + (\alpha + 1)x^2 + x + 2,$$

and

$$y(x) = x^7 + x^6 + \alpha x^5 + (\alpha + 1)x^3 + (\alpha + 1)x^2 + x + 2\alpha.$$

What are the solutions of $\Lambda = 0$ in the second case? How should the
decoder handle the situation?

Exercise 11. In this and the following exercise, we will discuss how a
portion of a direct calculation of the Gröbner basis for K with respect to
$>_{-1}$ starting from the generating set $\{g_1, g_2\} = \{(x^{2t}, 0), (S, 1)\}$ can also
be used for decoding. Consider the first steps of Buchberger’s algorithm.
Recall that S has degree $2t - 1$ or less.

a. Show that the first steps of the algorithm amount to applying the
1-variable division algorithm to divide S into $x^{2t}$, yielding an equation
$x^{2t} = qS + R$, with a quotient q of degree 1 or more, and a remainder R
that is either 0 or of degree smaller than deg S. This gives the equation

$$(x^{2t}, 0) = q(S, 1) + (R, -q).$$

b. Deduce that $g_2$ and $g_3 = (R, -q)$ also generate the module K, so $g_1$ can
actually be discarded for the Gröbner basis computation.

c. Proceed as in the Euclidean algorithm for polynomial GCD’s (see e.g.
[CLO], Chapter 1, §5), working on the first components. For instance,
at the next stage we find a relation of the form

$$(S, 1) = q_1 (R, -q) + (R_1, q_1 q + 1).$$

In the new module element, $g_4 = (R_1, q_1 q + 1)$, the degree of the first
component has decreased, and the degree of the second has increased.
Show that after a finite number of steps of this process, we will produce
an element $(\Omega, \Lambda)$ of the module K whose second component has degree
greater than the degree of the first, so that its $>_{-1}$ leading term is a
multiple of $e_2$.

d. Show that the element obtained in this way is a minimal element of K with
respect to $>_{-1}$. Hint: It is easy to see that a minimal element could be
obtained by removing any factors common to the two components of
this module element; by examining the triple $(\Omega, \Lambda, \Gamma)$ obtained as a
solution of the explicit form of the key equation: $\Omega = \Lambda S + x^{2t} \Gamma$, show
that in fact $\Omega$ and $\Lambda$ are automatically relatively prime.

Exercise 12. Apply the method from Exercise 11 to the decoding problem 
from the end of the text of this section. Compare your results with those of 
the other method. Also compare the amount of calculation needed to carry 
out each one. Is there a clear “winner”? 

Exercise 13. Apply the method from Exercise 11 to the decoding 
problems from Exercise 10. 



§5 Codes from Algebraic Geometry 

In the last 15 years, algebraic geometry has been used extensively in the
study of a very interesting class of codes, the geometric Goppa codes,
named after their discoverer, V. D. Goppa. Some of these codes have
extremely good parameters and the 1982 paper [TVZ] establishing this fact
was a major landmark in the history of coding theory. Our goal in this
section will be to give an introduction to these codes using a somewhat
unconventional approach. We will treat the Goppa codes in parallel with 
a second, somewhat simpler family of codes called the Reed-Muller codes. 
The constructions we present for both of these families may be seen as 
generalizations of the way we developed the Reed-Solomon codes in §3. 

Our reason for this approach is that a complete understanding of the 
geometric Goppa codes requires many notions from the classical theory of 
algebraic curves or function fields of transcendence degree one — divisors, 
linear systems, differentials, the Riemann-Roch theorem, the Jacobian va- 
riety, and so forth. Because Goppa’s construction uses curves defined over 
finite fields, there is also a large number-theoretic, or arithmetic aspect of 
the theory. These subjects are, unfortunately, rather far outside the scope 
of this book. On the other hand, there is a subclass of the geometric Goppa 
codes for which a reasonably elementary description is possible and these 
are the ones we will consider. Even though they are not the most general 
Goppa codes, they are related by duality to codes that have been studied 
intensively by coding theorists. One can construct reasonably efficient de- 
coding algorithms for these dual codes, using some of the same ideas as in 
our discussion of Reed-Solomon decoding algorithms from §4. See for in- 
stance [HeS] and the survey [HP]. We hope our preliminary treatment will 
entice the reader to pursue this subject further. To do this, we recommend 
consulting either the references [Mor], [vLvG], which give full presentations
of geometric Goppa codes using the language of curves, or [Sti], which uses 
the somewhat more direct language of function fields in one variable. 

We will begin by discussing the (generalized, or q-ary) Reed-Muller codes.
These codes may be defined using a very direct generalization of the
construction of the Reed-Solomon codes from §3 of this chapter: in (3.4) we
simply replace the points on the affine line by the points in a higher
dimensional affine space, and the one-variable polynomials by multivariable
polynomials. Namely, let us fix some numbering $\{P_1, P_2, \ldots, P_{q^m}\}$ of the
points of the m-dimensional affine space over $\mathbb{F}_q$. (The convention here is
to include the points with zero coordinates to make the block length as big
as possible. We could have done that in (3.4) also; the resulting code would
be what is known as an extended Reed-Solomon code.)

For each $\nu \ge 0$, let $L_\nu$ be the vector subspace of $\mathbb{F}_q[t_1, \ldots, t_m]$ consisting
of all polynomials of total degree $\le \nu$. As in §3, we evaluate the functions
in $L_\nu$ at our set of points $P_i$ to form codewords. Formally, we write the
evaluation mapping as

$$\mathrm{ev}_\nu : L_\nu \to \mathbb{F}_q^{q^m}, \quad f \mapsto (f(P_1), \ldots, f(P_{q^m})).$$

Then the Reed-Muller code $RM_q(m, \nu)$ is, by definition, the image of $\mathrm{ev}_\nu$.
It is easy to see that for each $m \ge 1$ and each $\nu \ge 0$, $RM_q(m, \nu)$ is a linear
code of block length $q^m$ over $\mathbb{F}_q$.
For example, evaluating the polynomials in the basis $\{1, t_1, t_2, t_1^2, t_1 t_2, t_2^2\}$
for $L_2 \subset \mathbb{F}_3[t_1, t_2]$ at the points in $\mathbb{F}_3^2$, numbered as

$P_1 = (0, 0)$, $P_2 = (1, 0)$, $P_3 = (2, 0)$, $P_4 = (0, 1)$, $P_5 = (1, 1)$,
$P_6 = (2, 1)$, $P_7 = (0, 2)$, $P_8 = (1, 2)$, $P_9 = (2, 2)$,

the Reed-Muller code $RM_3(2, 2)$ is spanned by the rows of the generator
matrix

(5.1)
$$G = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & 1 & 2 & 0 & 1 & 2 & 0 & 1 & 2 \\
0 & 0 & 0 & 1 & 1 & 1 & 2 & 2 & 2 \\
0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & 2 & 0 & 2 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}.$$

Exercise 1. What are the dimension and the minimum distance of the
code $RM_3(2, 2)$?

Binary Reed-Muller codes (those with q = 2) were the first to be
introduced (but by a different method from the one presented here). The code
$RM_2(5, 1)$ was used, for instance, in communications with the Mariner
Mars exploration craft launched in the late 1960’s. There is a very simple
decoding algorithm called majority logic decoding that can be used for the
binary Reed-Muller codes. However, Reed-Muller codes have been largely
superseded in practice by more powerful codes such as the Reed-Solomon
and BCH codes.

The geometric Goppa codes we will discuss can be obtained by
puncturing certain subcodes of these Reed-Muller codes. Puncturing means
removing some subset of the $q^m$ entries of the Reed-Muller codewords to
make words in a new code with a smaller block length. Moreover, we will
evaluate only the polynomials in a vector subspace $L \subset L_\nu$ for some $\nu$ at
these remaining points.

Here is an elementary example. In the affine plane (m = 2) over the field
$\mathbb{F}_4$, consider the points satisfying the polynomial equation $t_1^3 + t_2^2 + t_2 = 0$
(the equation of a curve in the plane). There are exactly eight such points.
Writing $\alpha$ for a primitive element of $\mathbb{F}_4$ (a root of $\alpha^2 + \alpha + 1 = 0$), the
eight points can be numbered as follows:

$P_1 = (0, 0)$, $P_2 = (0, 1)$, $P_3 = (1, \alpha)$, $P_4 = (1, \alpha^2)$,
$P_5 = (\alpha, \alpha)$, $P_6 = (\alpha, \alpha^2)$, $P_7 = (\alpha^2, \alpha)$, $P_8 = (\alpha^2, \alpha^2)$.

We can construct a code by evaluating the polynomials $\{1, t_1, t_2\}$ in $L_1$,
but only at the eight points $P_i$ above (the puncturing as described above).
This gives the following generator matrix for a code of block length n = 8
over $\mathbb{F}_4$:

(5.2)
$$G = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & \alpha & \alpha & \alpha^2 & \alpha^2 \\
0 & 1 & \alpha & \alpha^2 & \alpha & \alpha^2 & \alpha & \alpha^2
\end{pmatrix}.$$


(This is the same as the code from (2.4) of this chapter.) It is easy to see
that the rows of this matrix are linearly independent, so the resulting code
has dimension $k = 3$. Note that we could evaluate the polynomials in a
basis for any subspace of one of the $L_\nu$ to produce a code in a similar
fashion. For example, using a suitable four-dimensional subspace $L \subset L_2$
would give a punctured
code with $n = 8$ and $k = 4$ obtained from a subcode of the Reed-Muller
code $RM_4(2, 2)$.
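
The punctured evaluation code in this example can be reproduced with a few lines of Python. The sketch below is an illustration only: it encodes the elements 0, 1, $\alpha$, $\alpha^2$ of $\mathbb{F}_4$ as the integers 0, 1, 2, 3 (an encoding chosen here so that addition becomes bitwise XOR), lists the points of the affine curve $t_1^3 + t_2^2 + t_2 = 0$, and evaluates 1, $t_1$, $t_2$ at them.

```python
# F_4 = {0, 1, a, a^2} encoded as 0, 1, 2, 3 (an encoding chosen for this sketch):
# the two bits of each code are the coordinates in the basis {1, a}, so that
# addition is bitwise XOR and a^2 = a + 1 is the code 3.
EXP = [1, 2, 3]              # a^0, a^1, a^2
LOG = {1: 0, 2: 1, 3: 2}
NAMES = {0: "0", 1: "1", 2: "a", 3: "a^2"}

def mul(p, q):
    if p == 0 or q == 0:
        return 0
    return EXP[(LOG[p] + LOG[q]) % 3]

def on_curve(t1, t2):
    # t1^3 + t2^2 + t2 = 0, with addition in characteristic 2 given by XOR
    return (mul(t1, mul(t1, t1)) ^ mul(t2, t2) ^ t2) == 0

# Points of the affine plane over F_4 that lie on the curve.
points = [(t1, t2) for t1 in range(4) for t2 in range(4) if on_curve(t1, t2)]

# Evaluate the basis {1, t1, t2} of L_1 at each point to get the rows of G.
G = [[1 for _ in points],
     [t1 for t1, _ in points],
     [t2 for _, t2 in points]]

for row in G:
    print(" ".join(f"{NAMES[entry]:>3}" for entry in row))
```

Evaluating a larger set of monomials at the same points gives the generator matrices of the other punctured codes mentioned above.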

Goppa’s key insight about this idea of constructing codes by evaluating
functions was that especially interesting codes could be obtained if the
points remaining after the puncturing operation were the points with
coordinates in $\mathbb{F}_q$ on an algebraic curve X (that is, a variety of dimension 1)
defined over the field $\mathbb{F}_q$. We call these the $\mathbb{F}_q$-rational points on X, and
use the notation $X(\mathbb{F}_q)$ for the set of all $\mathbb{F}_q$-rational points on X. Moreover,
the functions to be evaluated should be chosen in a suitable vector space
of functions defined with reference to the curve X.

It is precisely here that the more advanced topics in algebraic geometry 
that we wish to avoid enter into the general construction. So in our pre- 
sentation we will make some strong assumptions about X that allow us to 
remain within the polynomial context used for the Reed-Muller codes. The 
following description of the classes of curves and functions we will consider 
is intended primarily for readers who do have some prior knowledge of the 
general theory of algebraic curves, but not necessarily of general geomet- 
ric Goppa codes. Others may wish to skip the following definition and the 
paragraphs following it on a first reading. 

(5.3) Definition. We will say a curve X is in special position if the
following conditions are satisfied.

a. The projective closure $\overline{X}$ of X in the projective space $\mathbb{P}^m$ over an
algebraic closure $\overline{\mathbb{F}}$ of $\mathbb{F}_q$ is a smooth curve, and the homogeneous ideal of
$\overline{X}$ is generated by polynomials with coefficients in $\mathbb{F}_q$. (More abstractly,
$\overline{X}$ is defined over the field $\mathbb{F}_q$.)

b. The projective curve $\overline{X}$ has only one point Q in the hyperplane at
infinity, $V(t_0)$, and Q also has coordinates in $\mathbb{F}_q$.

c. Finally, we require that the orders of the poles of the rational functions
$t_i/t_0$ at Q generate the semigroup of pole orders at Q of all rational
functions on $\overline{X}$ having poles only at Q.

For instance, $\overline{X} = V(t_1^3 + t_2^2 t_0 + t_2 t_0^2)$ in $\mathbb{P}^2$ over an algebraic closure of
$\mathbb{F}_4$, the curve used in the example above, is in special position. First, $\overline{X}$ is
defined over $\mathbb{F}_2 \subset \mathbb{F}_4$, and it is easy to check that $\overline{X}$ has no singular points.
Moreover, $\overline{X} \cap V(t_0) = \{(0, 0, 1)\}$, so $\overline{X}$ has only one $\mathbb{F}_4$-rational point on
the line at infinity. Finally, the rational function $t_1/t_0$ on $\overline{X}$ has zeros at
$(1, 0, 0)$ and $(1, 0, 1)$, and a pole of order 2 at $Q = (0, 0, 1)$. Similarly, the
rational function $t_2/t_0$ has a zero of order 3 at $(1, 0, 0)$ and a pole of order
3 at $Q$. The semigroup generated by the pole orders 2, 3 in $\mathbb{Z}_{\ge 0}$ omits only
the integer 1 (it has only one "gap"). Since $\overline{X}$ is a curve of genus 1 (see
below), the Weierstrass Gap Theorem (see, for example, [Sti]) implies this
is the entire semigroup of pole orders at $Q$.
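The gap count in this example is easy to check by machine. The following is a minimal sketch, not part of the original text: the function name `semigroup_gaps` is hypothetical, and the search bound relies on the standard fact that, for generators with greatest common divisor 1, every integer at least as large as the product of the smallest and largest generator is representable.

```python
def semigroup_gaps(generators):
    """Return the positive integers that are NOT nonnegative integer
    combinations of `generators` (assumed to have gcd 1)."""
    gens = sorted(set(generators))
    # Assumption: gcd(gens) == 1, so all gaps lie below gens[0] * gens[-1].
    limit = gens[0] * gens[-1]
    reachable = [False] * (limit + 1)
    reachable[0] = True
    for n in range(1, limit + 1):
        reachable[n] = any(n >= g and reachable[n - g] for g in gens)
    return [n for n in range(1, limit + 1) if not reachable[n]]


# Pole orders 2 and 3 at Q for the cubic over F_4 discussed above.
print(semigroup_gaps([2, 3]))
# Pole orders m and m + 1 with m = 4 (the Hermitian curve X_4 of Exercise 2).
print(semigroup_gaps([4, 5]))
```

For the generators 2, 3 the only gap is 1, matching the genus 1 of the cubic. For 4, 5 there are six gaps, which agrees with the genus $m(m-1)/2 = 6$ of the Hermitian curve $X_4$ from Exercise 2.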

We will only consider codes constructed from curves in special position
in the following. This restriction is strong, but perhaps less strong than it
appears at first. By standard techniques from algebraic geometry, it can
be shown that given any point $Q \in \overline{X}$, there exists a different projective
embedding of $\overline{X}$ such that the image of $Q$ is the only point at infinity on
the new, but birationally isomorphic, image curve. (See Exercise 13 below
for an example illustrating how this works.) Replacing $\overline{X}$ by the image of
such an embedding, the rational functions on $\overline{X}$ with poles only at $Q$ can be
identified with the polynomial functions from $\mathbb{F}_q[t_1, \ldots, t_m]$, restricted to
the affine curve $X$. By the last condition in (5.3), it suffices to consider only
vector subspaces of $\mathbb{F}_q[t_1, \ldots, t_m]$ spanned by collections of monomials in
order to obtain the full class of what are known as "one-point" geometric
Goppa codes: those constructed by evaluation of functions in one of the
spaces $L(aQ)$ of rational functions on $\overline{X}$ with poles of order $\le a$ at $Q$ and
no other poles. The duals of these codes form the special class of geometric
Goppa codes mentioned above that has been extensively developed by
coding theorists, and for which relatively good decoding algorithms are now
known.

Let $\mathcal{P} = \{P_1, \ldots, P_n\}$ be some collection of points on the affine curve
$X$ (as always, in special position), whose coordinates all lie in $\mathbb{F}_q$. Let $L$ be
some $\mathbb{F}_q$-vector space of polynomial functions with coefficients in $\mathbb{F}_q$. Our
geometric Goppa codes are the codes obtained as images of the evaluation
mappings

\[ ev_L : L \to \mathbb{F}_q^n, \qquad f \mapsto (f(P_1), \ldots, f(P_n)). \]

We will call the resulting code $C_X(\mathcal{P}, L)$, or just $C(\mathcal{P}, L)$ if the curve $X$ is
clear from the context.

One reason that Goppa understood that these codes would be interesting
is that, as for the Reed-Solomon codes, the number of zero entries in any
codeword (the number of zeros of a function at the $\mathbb{F}_q$-rational points on
$X$) is strongly limited by the form of the functions in $L$. On the other hand,
evaluating functions at the points on a general curve, not just on the affine
line as for Reed-Solomon codes, gives the possibility of finding codes with
bigger values of $n$, hence more codewords.

Indeed, a celebrated theorem of Hasse and Weil (the analog of the
Riemann Hypothesis for the function field of the curve) states that the number
of $\mathbb{F}_q$-rational points on a projective curve $\overline{X}$ over $\mathbb{F}_q$ satisfies an equation of
the form

\[ |\overline{X}(\mathbb{F}_q)| = 1 + q - \sum_{i=1}^{2g} \alpha_i, \]

where the $\alpha_i$ are algebraic numbers in $\mathbb{C}$ with $|\alpha_i| = \sqrt{q}$ for all $i$, and the
integer $g \ge 0$ is an invariant of $\overline{X}$ called the genus. (See [Mor] or [Sti] for a
full discussion of the Hasse-Weil theorem.) It follows that

\[ |\overline{X}(\mathbb{F}_q)| \le 1 + q + 2g\sqrt{q}. \]

One definition of $g$ for a smooth curve $\overline{X} \subset \mathbb{P}^m$ is the following. The
ideal $I = I(\overline{X}) \subset R = \mathbb{F}_q[t_0, \ldots, t_m]$ is a homogeneous ideal, and in §4
of Chapter 6, we studied the Hilbert polynomial $HP_{R/I}(\nu)$. Recall that for
sufficiently large values of $\nu$, the Hilbert polynomial satisfies

\[ HP_{R/I}(\nu) = \dim_{\mathbb{F}_q} (R/I)_\nu = \dim_{\mathbb{F}_q} R_\nu - \dim_{\mathbb{F}_q} I_\nu. \]

In our situation, $\overline{X} \subset \mathbb{P}^m$ is a smooth curve, which, as noted in §4 of
Chapter 6, implies that $HP_{R/I}(\nu)$ is of the form $d\nu + e$, where $d$ is the
degree of $\overline{X}$ (= number of intersections of $\overline{X}$ with a generic hyperplane in
$\mathbb{P}^m$). Then the genus of $\overline{X}$ is defined to be the number $g = 1 - e$.

For readers familiar with [CLO], note that the Hilbert polynomial
$HP_{R/I}(\nu)$ used here is identical with the Hilbert polynomial $HP_I(\nu)$
discussed in Chapter 9, §3 of [CLO].

As an example of how this works, suppose that $\overline{X}$ is a smooth projective
plane curve of degree $d$. This means that $I = \langle f \rangle \subset R = \mathbb{F}_q[t_0, t_1, t_2]$,
where $f = 0$ is the defining equation of $\overline{X}$. Then $I \cong R(-d)$ as an $R$-module
since $f$ has degree $d$, which gives the exact sequence

\[ 0 \to R(-d) \to R \to R/I \to 0. \]



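From here, the techniques of Chapter 6, §4 easily show that for sufficiently
large $\nu$,

\[ HP_{R/I}(\nu) = \binom{\nu+2}{2} - \binom{\nu-d+2}{2} = d\nu - \frac{d^2 - 3d}{2} = d\nu + 1 - \frac{(d-1)(d-2)}{2}. \tag{5.4} \]

Hence the genus for a plane curve of degree $d$ is given by

\[ g = \frac{(d-1)(d-2)}{2}. \tag{5.5} \]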

For readers who haven't studied Chapter 6, we should mention that (5.4)
can be proved by more elementary methods, such as those used to prove
equation (2.16) in Chapter 8. We should also point out that the above
calculation makes sense for singular plane curves as well. In that
case the number $(d-1)(d-2)/2$ is called the arithmetic genus.
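The arithmetic behind (5.4) and (5.5) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the Hilbert function of a plane curve of degree $d$ is taken to be the difference of the two binomial coefficients coming from the exact sequence above, which is valid for $\nu \ge d$.

```python
from math import comb


def plane_curve_hilbert_function(d, nu):
    """dim R_nu - dim R(-d)_nu for R a polynomial ring in three variables.
    Intended for nu >= d, where both binomials are genuine counts."""
    total = comb(nu + 2, 2)                       # monomials of degree nu
    shifted = comb(nu - d + 2, 2) if nu >= d else 0  # monomials of degree nu - d
    return total - shifted


def plane_curve_genus(d):
    """Formula (5.5): genus of a smooth plane curve of degree d."""
    return (d - 1) * (d - 2) // 2


# Check (5.4): the Hilbert function equals d*nu + 1 - g once nu >= d.
for d in range(1, 7):
    for nu in range(d, d + 6):
        assert plane_curve_hilbert_function(d, nu) == d * nu + 1 - plane_curve_genus(d)

print([plane_curve_genus(d) for d in range(1, 7)])
```

The printed list gives the genera for degrees 1 through 6; in particular cubics have genus 1, quartics genus 3, and quintics genus 6, the three values used in this section.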

The genus has a strong effect on the number of $\mathbb{F}_q$-rational points. For
instance, over $\mathbb{F}_4$, a curve of genus zero can have at most $1 + 4 = 5$ rational
points, while according to the Hasse-Weil bound, a curve of genus 1 could
have as many as

\[ 1 + 4 + 2 \cdot 1 \cdot \sqrt{4} = 9 \]

$\mathbb{F}_4$-rational points. Although it is not always possible to find curves over
$\mathbb{F}_q$ attaining the Hasse-Weil bound, for $g = 1$ and $q = 4$, they do exist.
Indeed the projective curve $\overline{X} = V(t_1^3 + t_2^2 t_0 + t_2 t_0^2)$ (the projective closure
of the variety used to construct the code given by (5.2)) is such a curve. It
is easy to check that $\overline{X}$ is a smooth curve of degree 3, so by the formula
(5.5), $g = 1$. Moreover, $\overline{X}$ has 9 $\mathbb{F}_4$-rational points: the eight points used
above, and the one further point $Q = (0, 0, 1)$ on the line at infinity in $\mathbb{P}^2$.
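The point count in this example can be confirmed by brute force. The following sketch is an illustration only. It assumes a particular encoding of $\mathbb{F}_4 = \mathbb{F}_2[\alpha]/\langle \alpha^2 + \alpha + 1 \rangle$ as the integers 0, 1, 2, 3 (with 2 standing for $\alpha$ and 3 for $\alpha + 1$), so that addition is bitwise XOR; the helper names are hypothetical.

```python
from math import sqrt, floor


def gf4_mul(a, b):
    """Multiply two elements of F_4, encoded as 2-bit integers."""
    prod = 0
    for i in range(2):
        if (b >> i) & 1:
            prod ^= a << i
    if prod & 0b100:          # reduce modulo alpha^2 + alpha + 1
        prod ^= 0b111
    return prod


def gf4_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf4_mul(result, a)
    return result


def hasse_weil_upper_bound(q, g):
    """Largest integer not exceeding 1 + q + 2 g sqrt(q)."""
    return floor(1 + q + 2 * g * sqrt(q))


# Affine points of t1^3 + t2^2 + t2 = 0 over F_4 (addition is XOR).
affine_points = [(t1, t2) for t1 in range(4) for t2 in range(4)
                 if (gf4_pow(t1, 3) ^ gf4_pow(t2, 2) ^ t2) == 0]

print(len(affine_points))            # affine F_4-rational points
print(len(affine_points) + 1)        # plus the point Q at infinity
print(hasse_weil_upper_bound(4, 1))  # the bound for q = 4, g = 1
```

The affine count is 8, so together with $Q$ the projective curve has 9 rational points, which is exactly the Hasse-Weil bound for $q = 4$, $g = 1$.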

The curve in this example is the first of the family of Hermitian curves.
There is a Hermitian curve defined over each field of square order, $\mathbb{F}_{m^2}$:
$\overline{X}_m = V(t_1^{m+1} - t_2^m t_0 - t_2 t_0^m)$. The following exercise gives some interesting
properties of these curves; there are many others too!

Exercise 2. 

a. Show that the projective Hermitian curve $\overline{X}_m$ is smooth of genus $g =
m(m-1)/2$.

b. Find the set $\mathcal{P}$ of all 27 $\mathbb{F}_9$-rational points on the affine curve $X_3 =
V(t_1^4 - t_2^3 - t_2)$ and construct the generator matrix for the code $C(\mathcal{P}, L)$
where $L = \mathrm{Span}\{1, t_1, t_2, t_1 t_2\}$.

c. How many $\mathbb{F}_{16}$-rational points does the affine curve $X_4$ have? Find
them.

d. Show that for all $m$, the projective Hermitian curve $\overline{X}_m$ attains the
Hasse-Weil bound over $\mathbb{F}_{m^2}$:

\[ |\overline{X}_m(\mathbb{F}_{m^2})| = 1 + m^2 + m(m-1)m = 1 + m^3 \]

(do not forget the point at infinity).

e. (For readers familiar with the theory of algebraic curves or function
fields.) Show that every Hermitian curve $X_m$ is in special position as in
Definition (5.3). Hint: Show that the rational functions $t_1/t_0$ and
$t_2/t_0$ have pole orders $m$ and $m + 1$ respectively at $Q = (0, 0, 1)$. Then
use the Weierstrass Gap Theorem to conclude that these pole orders
generate the whole semigroup of pole orders at $Q$ (the point $Q$ is a
Weierstrass point of high Weierstrass weight).

Exercise 3. (The Klein Quartic curve) Consider the curve

\[ K = V(t_1^3 t_2 + t_2^3 t_0 + t_0^3 t_1) \]

in the projective plane, defined over the field $\mathbb{F}_8 = \mathbb{F}_2[\alpha]/\langle \alpha^3 + \alpha + 1 \rangle$.

a. Show that $K$ is smooth of degree 4, hence has genus $g = 3$.

b. In the next parts of the problem, you will show that $K$ has 24 points
rational over $\mathbb{F}_8$. First show that the three points

\[ Q_0 = (1, 0, 0), \quad Q_1 = (0, 1, 0), \quad Q_2 = (0, 0, 1) \]

are $\mathbb{F}_8$-rational points on $K$.

c. Show that the mappings

and

take the set $K(\mathbb{F}_8)$ to itself.

d. Deduce that $K(\mathbb{F}_8)$ contains 21 points $P_{ij}$ in addition to $Q_0, Q_1, Q_2$:

\[ P_{ij} = \tau^i(\sigma^j(P_{00})), \]

where Pqo == (1, + a) G Pr(F8). 

e. Deduce that $|K(\mathbb{F}_8)| = 24$ exactly. Hint: what does the Hasse-Weil
bound say in this case? Then use part c.

f. Does K satisfy Definition (5.3)? Why or why not? Also see Exercise 13 
below. 

Determining the parameters of the Reed-Muller and geometric Goppa
codes (in particular the minimum distance) is quite subtle, but it forms
a nice application of some of the tools we have developed, so we will
consider this in some detail. We begin with the Reed-Muller codes. To
understand $k = \dim RM_q(m, \nu)$, notice that if $\nu$ is sufficiently large, the
polynomial relations $t_i^q - t_i = 0$ that hold identically on $\mathbb{F}_q^m$ can produce
linear dependences between the vectors $ev_\nu(f)$ for $f \in L_\nu$.

To study these dependencies systematically, we will introduce the affine
Hilbert function of an ideal $I \subset \mathbb{F}_q[t_1, \ldots, t_m]$, which is defined as follows.
Given $\nu$, let $I_{\le\nu}$ be the vector space of elements of total degree $\le \nu$ in $I$.
Then the affine Hilbert function is defined to be

\[ {}^a HF_I(\nu) = \dim_{\mathbb{F}_q} \mathbb{F}_q[t_1, \ldots, t_m]_{\le\nu} - \dim_{\mathbb{F}_q} I_{\le\nu}, \]

where $\mathbb{F}_q[t_1, \ldots, t_m]_{\le\nu}$ is the set of all polynomials of total degree $\le \nu$. The
affine Hilbert function is studied in detail in Chapter 9, §3 of [CLO]. In
Exercise 14 below, we will see how this relates to the Hilbert functions
discussed in Chapter 6 of this book.

The crucial fact is that $\dim RM_q(m, \nu)$ is the affine Hilbert function of
the ideal generated by the $t_i^q - t_i$ for $1 \le i \le m$.

(5.6) Proposition. For all $m$, the dimension of $RM_q(m, \nu)$ is equal to
the value of the affine Hilbert function ${}^a HF_I(\nu)$, where $I = \langle t_i^q - t_i : i =
1, \ldots, m \rangle$.

Proof. The kernel of the linear mapping $ev_\nu$ consists of the polynomials
of total degree $\le \nu$ in $\mathbb{F}_q[t_1, \ldots, t_m]$ vanishing at all points of $\mathbb{F}_q^m$. We claim
that this kernel equals the vector space $I_{\le\nu}$ of elements of total degree $\le \nu$
in $I$. You will prove this in Exercise 8 below. By the dimension theorem of
linear algebra, it follows that

\[ \dim_{\mathbb{F}_q} ev_\nu(L_\nu) = \dim_{\mathbb{F}_q} L_\nu - \dim_{\mathbb{F}_q} I_{\le\nu}. \]

However, $L_\nu$ is defined to be $\mathbb{F}_q[t_1, \ldots, t_m]_{\le\nu}$, so that we are done by the
definition of affine Hilbert function. □



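Because of the simple form of the generators of $I$, one easily sees that
$\langle \mathrm{LT}(I) \rangle = \langle t_1^q, \ldots, t_m^q \rangle$ for any monomial order. By Proposition 4 of
Chapter 9, §3 of [CLO], the affine Hilbert function of an ideal and its ideal of
leading terms coincide, so that

\[ {}^a HF_I(\nu) = {}^a HF_{\langle \mathrm{LT}(I) \rangle}(\nu). \]

We can use this fact to compute ${}^a HF_I$. For each $i = 1, \ldots, m$, let $S_i(\nu)$ be
the set of monomials of total degree $\le \nu$ divisible by $t_i^q$ in $\mathbb{F}_q[t_1, \ldots, t_m]$.
Then

\[ {}^a HF_I(\nu) = \binom{m+\nu}{\nu} - \dim \langle t_1^q, \ldots, t_m^q \rangle_{\le\nu} = \binom{m+\nu}{\nu} - |S_1(\nu) \cup \cdots \cup S_m(\nu)|. \]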

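We have $|S_i(\nu)| = \binom{m+\nu-q}{\nu-q}$ for each $i$, $|S_i(\nu) \cap S_j(\nu)| = \binom{m+\nu-2q}{\nu-2q}$ for each
pair $i \ne j$, and so forth. Hence, by the inclusion-exclusion principle (as
explained in Chapter 9, §2 of [CLO]),

\[ \dim RM_q(m, \nu) = {}^a HF_I(\nu) = \sum_{j=0}^{m} (-1)^j \binom{m}{j} \binom{m + \nu - jq}{\nu - jq}, \tag{5.7} \]

where a binomial coefficient whose lower entry $\nu - jq$ is negative is taken to be zero.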

jq 



In Exercise 14, we will see that this formula has a nice interpretation in 
terms of the exactness of a certain Koszul complex. 
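Formula (5.7) is straightforward to evaluate by machine, and it can be cross-checked against a direct count of the monomials $t^\beta$ with $0 \le \beta_i \le q - 1$ and total degree at most $\nu$ (equivalently, the integers below $q^m$ whose base-$q$ digit sum is at most $\nu$). The sketch below is an illustration only; the function names are hypothetical, and binomial coefficients with a negative lower entry are treated as zero.

```python
from math import comb


def binom(n, k):
    """Binomial coefficient, taken to be zero outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0


def rm_dimension(q, m, nu):
    """Dimension of RM_q(m, nu) via the inclusion-exclusion formula (5.7)."""
    return sum((-1) ** j * comb(m, j) * binom(m + nu - j * q, nu - j * q)
               for j in range(m + 1))


def digit_sum(n, q):
    """Sum of the base-q digits of the nonnegative integer n."""
    total = 0
    while n:
        n, r = divmod(n, q)
        total += r
    return total


def rm_dimension_by_digits(q, m, nu):
    """Count exponent vectors (read as base-q digits) with digit sum <= nu."""
    return sum(1 for l in range(q ** m) if digit_sum(l, q) <= nu)


print(rm_dimension(3, 2, 2))

# The two counts agree for a range of small parameters.
for q in (2, 3, 4, 5):
    for m in (1, 2, 3):
        for nu in range(m * (q - 1) + 2):
            assert rm_dimension(q, m, nu) == rm_dimension_by_digits(q, m, nu)
```

The loop compares the closed formula with the brute-force count, including values of $\nu$ at and slightly beyond $m(q-1)$, where both counts equal $q^m$.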

For example, with $m = 2$, $\nu = 2$, $q = 3$, we obtain

\[ \dim RM_3(2, 2) = \binom{2+2}{2} = 6. \]



This should agree with your result from Exercise 1. The following exercise
gives another way to compute the dimension of $RM_q(m, \nu)$, using some
generating function techniques.

Exercise 4. The standard monomial basis of the quotient ring

\[ \mathbb{F}_q[t_1, \ldots, t_m] / \langle t_i^q - t_i : i = 1, \ldots, m \rangle \]

consists of all monomials $t^\beta$ where $0 \le \beta_i \le q - 1$ for all $i$.

a. Show that the number of such monomials of degree exactly equal
to $\nu$ is the same as the coefficient of $u^\nu$ in the expanded form of the
one-variable polynomial $(1 + u + \cdots + u^{q-1})^m$ in $\mathbb{Z}[u]$.

b. Before collecting like terms in this expansion, each term has the form

\[ u^{\ell_0} u^{\ell_1} \cdots u^{\ell_{m-1}} \]

for some $0 \le \ell_i \le q - 1$. Show that using the $\ell_i$ as digits in a base-$q$
expansion, the coefficient of $u^\nu$ can also be expressed as the number of
integers $\ell$ in the range $0 \le \ell \le q^m - 1$ whose base-$q$ expansion has digit
sum equal to $\nu$.

c. If $\ell = \sum_{i=0}^{m-1} \ell_i q^i$ is the base-$q$ expansion of the integer $\ell$, write $w_q(\ell)$ for
the digit sum $\sum_{i=0}^{m-1} \ell_i$. Deduce that

\[ \dim RM_q(m, \nu) = |\{\ell : 0 \le \ell \le q^m - 1, \text{ and } w_q(\ell) \le \nu\}|. \]

The dimensions of geometric Goppa codes can be determined in a similar
fashion, except that the equations defining the curve $X$ must also be taken
into account. For instance, consider the Hermitian curve $X_4$ over the field
$\mathbb{F}_{16}$. Let $\mathcal{P}$ be the set of all 64 affine $\mathbb{F}_{16}$-rational points on $X_4$ (listed in
some particular order), and consider the codes $C(\mathcal{P}, L)$ for $L = L_\nu$, the
full vector space of polynomials of degree $\le \nu$ in $t_1, t_2$. Here the kernel of
the evaluation mapping

\[ ev_\nu : L_\nu \to \mathbb{F}_{16}^{64}, \qquad f \mapsto (f(P_1), \ldots, f(P_{64})) \]

consists of the elements in $J \cap L_\nu$, where

\[ J = \langle t_1^5 + t_2^4 + t_2,\; t_1^{16} + t_1,\; t_2^{16} + t_2 \rangle, \]

the ideal of the finite set $X_4(\mathbb{F}_{16})$. A Grobner basis for $J$ with respect to
the lex order with $t_1 > t_2$ is given by

\[ G = \{t_1^5 + t_2^4 + t_2,\; t_1(t_2^{12} + t_2^9 + t_2^6 + t_2^3 + 1),\; t_2^{16} + t_2\}. \]

Hence as in Proposition (5.6), the dimension of $C(\mathcal{P}, L_\nu)$ can be determined
using the affine Hilbert function of the ideal $J$. However, this equals the
Hilbert function (as defined in Chapter 6 of this book) of the projective
curve $\overline{X}_4$. This has degree 5 and hence, by (5.5), genus 6. It follows that
its Hilbert polynomial is $5\nu + 1 - 6 = 5\nu - 5$. Hence the dimension of

$C(\mathcal{P}, L_\nu)$ is given by

\[ \dim_{\mathbb{F}_{16}} C(\mathcal{P}, L_\nu) = 5\nu - 5. \]

In particular, for $\nu = 6$, we have $\dim_{\mathbb{F}_{16}} C(\mathcal{P}, L_6) = 25$. A basis for the
resulting code will consist, for example, of the images of the 25 monomials

\[ \{t_1^a t_2^b : 0 \le a \le 4,\ a + b \le 6\} \]

under the evaluation mapping on the points in $\mathcal{P}$. Note that $t_1^5$ is not
included because that monomial reduces to the linear combination $t_2^4 + t_2$
modulo $G$. For $\nu \ge 16$, we would need to reduce by the $t_i^{16} + t_i$ as well.
For proper subspaces of the $L_\nu$ spanned by collections of monomials, we
could proceed in a similar fashion, taking remainders on division by $G$. For
instance, the above calculation also shows that for the subspace $L \subset L_6$
spanned by all monomials of degree $\le 6$ except the resulting code
$C(\mathcal{P}, L)$ has dimension $k = 23$.
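The dimension count for the Hermitian code can be checked directly by computing the rank of the evaluation matrix over $\mathbb{F}_{16}$. The sketch below is an illustration under stated assumptions: $\mathbb{F}_{16}$ is modeled as $\mathbb{F}_2[x]/\langle x^4 + x + 1 \rangle$ with elements encoded as the integers 0 to 15 (any other irreducible quartic would give an isomorphic field), addition is XOR, and all function names are hypothetical.

```python
from itertools import product


def gf16_mul(a, b):
    """Multiply in F_16 = F_2[x]/(x^4 + x + 1); elements are ints 0..15."""
    prod = 0
    while b:
        if b & 1:
            prod ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13            # reduce by x^4 + x + 1
    return prod


def gf16_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf16_mul(result, a)
    return result


def rank_gf16(rows):
    """Rank of a matrix over F_16 by Gauss-Jordan elimination."""
    rows = [row[:] for row in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = gf16_pow(rows[rank][col], 14)      # a^15 = 1, so a^14 = 1/a
        rows[rank] = [gf16_mul(inv, x) for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [x ^ gf16_mul(factor, y)
                           for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def hermitian_points():
    """Affine F_16-points of t1^5 + t2^4 + t2 = 0."""
    return [(t1, t2) for t1, t2 in product(range(16), repeat=2)
            if (gf16_pow(t1, 5) ^ gf16_pow(t2, 4) ^ t2) == 0]


def code_dimension(nu):
    """Rank of the evaluation map on all monomials of total degree <= nu."""
    points = hermitian_points()
    monomials = [(a, b) for a in range(nu + 1) for b in range(nu + 1 - a)]
    matrix = [[gf16_mul(gf16_pow(t1, a), gf16_pow(t2, b)) for (t1, t2) in points]
              for (a, b) in monomials]
    return rank_gf16(matrix)


print(len(hermitian_points()))
print(code_dimension(6))
```

Running this prints the number of affine points (64) and the dimension for $\nu = 6$ (25), in agreement with the values stated above. Working with the rank of the full evaluation matrix avoids any Grobner basis computation, at the cost of being practical only for small fields.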

A more advanced approach valid in general for the one-point codes
obtained from evaluation of the rational functions in $L = L(aQ)$ would be to
use the Riemann-Roch Theorem to determine the dimension of the vector
space $L(aQ)$. For $a \ge 2g - 1$ for instance, that result would imply

\[ \dim_{\mathbb{F}_q} L(aQ) = a + 1 - g. \]

(For $a \le 2g - 2$, there would also be a non-negative correction term added
to the right-hand side; see [Sti] for details.) Then, the dimension of the
subspace of $L(aQ)$ consisting of functions vanishing at all the points in $\mathcal{P}$,
which can also be computed via Riemann-Roch, would be subtracted to
yield the dimension of $C(\mathcal{P}, L)$.

Next we return to Reed-Muller codes and consider the problem of
determining the minimum distance. If we delete codeword entries corresponding to
points in $\mathbb{F}_q^m$ with some coordinate equal to zero, the resulting punctured
Reed-Muller codes of block length $(q-1)^m$ are $m$-dimensional cyclic codes
as defined in §3, under a suitable reordering of the entries.



Exercise 5. Explain how the remaining entries in the codewords of a
Reed-Muller code can be ordered so that the punctured code is invariant
under all $m$ one-variable cyclic shifts, after the entries corresponding to
points in $\mathbb{F}_q^m$ with some zero coordinate are deleted to yield codewords of
block length $(q-1)^m$.

Some analogs of this result also exist for geometric Goppa codes from
curves with many automorphisms. See Exercise 12 below for a simple
example, and [HLS] for a systematic discussion. In the following discussion,
we will denote by $RM_q(m, \nu)^*$ the code obtained by deleting only the
entry corresponding to the origin. This resulting once-punctured code has an
even more interesting property.

(5.8) Theorem. Under a suitable reordering of the entries of the
codewords, $RM_q(m, \nu)^*$ is a cyclic code of block length $q^m - 1$.

In coding theory texts, this fact is usually expressed by saying that
$RM_q(m, \nu)^*$ is equivalent to a cyclic code; two codes of block length $n$
are said to be equivalent if there is some fixed permutation of the entries
in vectors of length $n$ that takes one code into the other.



Proof. By the results of §1 of this chapter, for each $m \ge 1$, there exists
an irreducible polynomial of degree $m$ in $\mathbb{F}_q[u]$, hence a finite field $\mathbb{F}_{q^m}$ of
order $q^m$ containing $\mathbb{F}_q$. Pick a primitive element $\alpha$ for $\mathbb{F}_{q^m}$, and write the
monic minimal polynomial of $\alpha$ over $\mathbb{F}_q$ as

\[ f(u) = u^m + c_{m-1} u^{m-1} + \cdots + c_1 u + c_0. \tag{5.9} \]

The powers $1, \alpha, \ldots, \alpha^{m-1}$ form a basis for $\mathbb{F}_{q^m}$ as a vector space over
$\mathbb{F}_q$, so the mapping

\[ \varphi : \mathbb{F}_q^m \to \mathbb{F}_{q^m}, \qquad (a_0, \ldots, a_{m-1}) \mapsto a_0 + a_1 \alpha + \cdots + a_{m-1} \alpha^{m-1} \tag{5.10} \]

defines an isomorphism of $\mathbb{F}_q$-vector spaces and lets us identify $\mathbb{F}_q^m$
with $\mathbb{F}_{q^m}$. Under this identification, the points different from the origin
correspond to the nonzero elements of the field $\mathbb{F}_{q^m}$.

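Let $A$ be the $m \times m$ companion matrix of the polynomial $f$ from (5.9):

\[ A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -c_0 & -c_1 & -c_2 & \cdots & -c_{m-1} \end{pmatrix}. \]

In Exercise 9 below, you will show that for all row vectors $x \in \mathbb{F}_q^m$, $\varphi(xA) =
\alpha\varphi(x)$, where the right-hand side is the product in $\mathbb{F}_{q^m}$. It follows that
$A^{q^m - 1}$ equals the $m \times m$ identity matrix. In particular $A$ is invertible so
that if $P \ne (0, \ldots, 0)$, then $PA \ne (0, \ldots, 0)$.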
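To make the role of $A$ concrete, here is a small sketch for a prime field. It is an illustration under stated assumptions: $q$ is taken to be prime so that arithmetic is ordinary arithmetic modulo $q$, the example polynomial $u^2 + u + 2$ over $\mathbb{F}_3$ is a choice made here (its roots have order 8, so it is primitive), and the function names are hypothetical. The orbit of $P_0 = (1, 0)$ under right multiplication by $A$ lists the points in the order used in the proof.

```python
def companion_matrix(coeffs, q):
    """Companion matrix of u^m + c_{m-1} u^{m-1} + ... + c_0 over F_q (q prime),
    where coeffs = [c_0, ..., c_{m-1}], laid out as in the text."""
    m = len(coeffs)
    A = [[0] * m for _ in range(m)]
    for i in range(m - 1):
        A[i][i + 1] = 1
    A[m - 1] = [(-c) % q for c in coeffs]
    return A


def row_times_matrix(x, A, q):
    """Row vector x times matrix A, with entries reduced modulo q."""
    m = len(A)
    return tuple(sum(x[i] * A[i][j] for i in range(m)) % q for j in range(m))


def orbit(P0, A, q):
    """The points P0, P0 A, P0 A^2, ... until the orbit returns to P0."""
    points = [P0]
    P = row_times_matrix(P0, A, q)
    while P != P0:
        points.append(P)
        P = row_times_matrix(P, A, q)
    return points


# Assumed example: f(u) = u^2 + u + 2 over F_3, so c_0 = 2 and c_1 = 1.
A = companion_matrix([2, 1], 3)
points = orbit((1, 0), A, 3)
print(A)
print(points)
```

The orbit has length $q^m - 1 = 8$ and passes through every nonzero vector of $\mathbb{F}_3^2$ exactly once, so $A^8$ is the identity and the points $P_i = P_0 A^i$ give the ordering of codeword entries under which multiplication by $A$ acts as a cyclic shift.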

Relabel the elements in $\mathbb{F}_q^m \setminus \{(0, \ldots, 0)\}$ as $P_i$, $0 \le i \le q^m - 2$, so that
$\varphi(P_i) = \alpha^i \in \mathbb{F}_{q^m}$ for all $i$. Order the entries in the punctured Reed-Muller
codewords accordingly. Then for each codeword $(f(P_0), \ldots, f(P_{q^m - 2})) \in
RM_q(m, \nu)^*$, the word $(f(P_0 A), \ldots, f(P_{q^m - 2} A))$ is just the left cyclic shift.
But now consider the vector of variables $t = (t_1, \ldots, t_m)$. The entries of
$tA$ are homogeneous linear polynomials in the $t_i$, so for each $f \in L_\nu$ the
polynomial $f(tA)$ is also in $L_\nu$. Putting these observations together, we see
that the cyclic shift on $\mathbb{F}_q^{q^m - 1}$ leaves $RM_q(m, \nu)^*$ invariant. As a result,
under the reordering of the entries of codewords as above, $RM_q(m, \nu)^*$ is
a cyclic code. □



By Exercise 12 of §3 of this chapter, we can obtain a lower bound on the
minimum distance from the analysis of the roots of the generator
polynomial of a cyclic code. To identify the roots of the generator polynomial of
$RM_q(m, \nu)^*$, we need the following consequence of Exercise 8 from §1 of
this chapter.

(5.11) Lemma. Let $f \in \mathbb{F}_q[t_1, \ldots, t_m]$ be any polynomial of total degree
$< m(q - 1)$. Then

\[ \sum_{x \in \mathbb{F}_q^m} f(x) = 0 \in \mathbb{F}_q. \]

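Proof. By linearity, it is enough to prove the claim when $f$ is a monomial:
$f = t^\beta$, where $\beta = (\beta_1, \ldots, \beta_m)$. If some $\beta_i = 0$, then the result follows
immediately since the sum of any $q$ identical terms is zero in $\mathbb{F}_q$. Now
assume $\beta_i > 0$ for all $i$. This implies we can ignore vectors $x$ with zero
components, since

\[ \sum_{x \in \mathbb{F}_q^m} x^\beta = \sum_{x \in (\mathbb{F}_q^*)^m} x^\beta. \tag{5.12} \]

The sum on the right factors as the product over $i$ of the one-variable sums
$\sum_{x_i \in \mathbb{F}_q^*} x_i^{\beta_i}$. Because the monomial $t^\beta$ has total degree less than $m(q - 1)$,
some $\beta_i$ must be less than $q - 1$. Letting $\alpha$ be a primitive element for $\mathbb{F}_q$,
since $\alpha^{\beta_i} \ne 0, 1$,

\[ \sum_{x_i \in \mathbb{F}_q^*} x_i^{\beta_i} = \sum_{j=0}^{q-2} (\alpha^j)^{\beta_i} = \sum_{j=0}^{q-2} (\alpha^{\beta_i})^j = 0 \]

by Exercise 8 from §1. Hence the sum on the right of (5.12) is zero. □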
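Lemma (5.11) is easy to test numerically when $q$ is prime. The sketch below is an illustration only: it assumes a prime $p$ so that $\mathbb{F}_p$ is the integers modulo $p$, and the function name `monomial_sum` is hypothetical. It also shows that the degree hypothesis cannot be dropped.

```python
from itertools import product


def monomial_sum(beta, p):
    """Sum of x^beta over all x in F_p^m, for a prime p, reduced modulo p.
    Python's pow(0, 0, p) is 1, matching the convention t_i^0 = 1."""
    total = 0
    for x in product(range(p), repeat=len(beta)):
        term = 1
        for xi, bi in zip(x, beta):
            term = term * pow(xi, bi, p) % p
        total = (total + term) % p
    return total


p, m = 5, 2
# Every monomial of total degree < m(p - 1) = 8 sums to zero.
for beta in product(range(8), repeat=m):
    if sum(beta) < m * (p - 1):
        assert monomial_sum(beta, p) == 0

# The bound is sharp: the monomial t_1^4 t_2^4 of degree exactly 8 does not.
print(monomial_sum((4, 4), p))
```

The last line prints a nonzero value, since $x^4 = 1$ for each of the four nonzero elements of $\mathbb{F}_5$ and so the sum is $4 \cdot 4 = 16 \equiv 1$.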

The roots of the generator polynomial of $RM_q(m, \nu)^*$ may be found by an
argument similar to the one we used in determining the roots of a Reed-Solomon
code (see the discussion around (3.6) of this chapter). First we express each
codeword as the coset of a polynomial in $\mathbb{F}_q[x]/\langle x^{q^m - 1} - 1 \rangle$:

\[ (f(P_0), f(P_0 A), \ldots, f(P_0 A^{q^m - 2})) \;\leftrightarrow\; \sum_{j=0}^{q^m - 2} f(P_0 A^j) x^j. \tag{5.13} \]

Writing $g(x)$ for the generator polynomial of $RM_q(m, \nu)^*$, we have the
following description of the roots. For each integer $\ell$, let $w_q(\ell)$ be the sum of
the digits in the base-$q$ expansion of the integer $\ell$. (That is, if $\ell = \sum_{i=0}^{m-1} \ell_i q^i$,
where $0 \le \ell_i \le q - 1$, then $w_q(\ell) = \sum_{i=0}^{m-1} \ell_i$.)

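(5.14) Proposition. Let $\nu < m(q - 1)$ and let $\alpha$ be the primitive
element for $\mathbb{F}_{q^m}$ as above. Then $\alpha^\ell$ is a root of the generator polynomial
for $RM_q(m, \nu)^*$ if and only if $0 < \ell < q^m - 1$, and $0 < w_q(\ell) \le
m(q - 1) - \nu - 1$.

Proof. From the identification of $\mathbb{F}_q^m$ and $\mathbb{F}_{q^m}$ given in (5.10), we have
that the element $\alpha^j \in \mathbb{F}_{q^m}$ can be expressed using the inner product
as $\alpha^j = \langle P_0 A^j, b \rangle$, where $b = (1, \alpha, \ldots, \alpha^{m-1}) \in \mathbb{F}_{q^m}^m$ (recall $P_0$
corresponds to $1 \in \mathbb{F}_{q^m}$). Hence if $\sum_{j=0}^{q^m - 2} f(P_0 A^j) x^j$ is the polynomial form of
a punctured Reed-Muller codeword, then

\[
\begin{aligned}
\sum_{j=0}^{q^m - 2} f(P_0 A^j) (\alpha^\ell)^j
&= \sum_{j=0}^{q^m - 2} f(P_0 A^j) \langle P_0 A^j, b \rangle^\ell \\
&= \sum_{t \in \mathbb{F}_q^m} f(t) \langle t, b \rangle^\ell \\
&= \sum_{t \in \mathbb{F}_q^m} f(t) \prod_{i=0}^{m-1} \Bigl( \sum_{s=1}^{m} t_s \alpha^{(s-1) q^i} \Bigr)^{\ell_i},
\end{aligned}
\]

where the last equality uses the fact that $t_j^q = t_j$ at the points in $\mathbb{F}_q^m$. The
product in the term in the sum on this last line is a polynomial of total
degree $\sum_{i=0}^{m-1} \ell_i = w_q(\ell)$ in the $t_i$. Hence the product with $f(t)$ is a polynomial
of total degree $\nu + w_q(\ell)$, which is $\le m(q - 1) - 1$ by the hypothesis on $\ell$.
By Lemma (5.11), the sum is zero.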

It follows that the generator polynomial for $RM_q(m, \nu)^*$ (and
consequently every codeword polynomial) is divisible in $\mathbb{F}_{q^m}[x]$ by $h(x) =
\prod_\ell (x - \alpha^\ell)$, where $\ell$ runs over all integers $0 < \ell < q^m - 1$ satisfying
$0 < w_q(\ell) \le m(q - 1) - \nu - 1$. In fact, raising the roots of $h(x)$ to
the $q$th power (applying the Frobenius automorphism of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$) just
permutes the set of roots, so it is easy to see that $h(x)$ has coefficients
in $\mathbb{F}_q$. The code with generator polynomial $h(x)$ contains $RM_q(m, \nu)^*$. Its
dimension is $q^m - 1 - \deg h$, which is the same as the number of integers
$u$ in the range $0 \le u \le q^m - 2$ with $u = 0$, or $w_q(u) > m(q - 1) - \nu - 1$.
In Exercise 11 below, you will show that this is the same as the dimension
of $RM_q(m, \nu)^*$ over $\mathbb{F}_q$. Hence $h(x)$ is actually the generator polynomial of
$RM_q(m, \nu)^*$, and this completes the proof. □

From this proposition and Exercise 12 of §3 of this chapter, we deduce
the following statement.

(5.15) Corollary. Let $\nu < m(q - 1)$ and let $r, s$ be the quotient and
remainder respectively on division of $\nu$ by $q - 1$, so that $\nu = r(q - 1) + s$,
and $0 \le s < q - 1$. Then the minimum distance $d$ of the punctured
Reed-Muller code $RM_q(m, \nu)^*$ is at least $(q - s) q^{m-r-1} - 1$. The minimum
distance of the unpunctured code $RM_q(m, \nu)$ is at least $(q - s) q^{m-r-1}$.

Proof. If $0 < w_q(\ell) < m(q - 1) - \nu$, then substituting $\nu = r(q - 1) + s$
we have $w_q(\ell) < (m - r - 1)(q - 1) + (q - 1 - s)$. The smallest integer $\ell_0$
for which $w_q(\ell_0) = (m - r - 1)(q - 1) + (q - 1 - s)$ is

\[
\begin{aligned}
\ell_0 &= (q - 1 - s) q^{m-r-1} + \sum_{j=0}^{m-r-2} (q - 1) q^j \\
&= (q - 1 - s) q^{m-r-1} + q^{m-r-1} - 1 \\
&= (q - s) q^{m-r-1} - 1
\end{aligned} \tag{5.16}
\]

(where the second equality holds because the sum telescopes). By (5.14),
the consecutive powers $\alpha, \alpha^2, \ldots, \alpha^{\ell_0 - 1}$ are all roots of $h(x)$. By Exercise
12 of §3, it follows that the minimum distance is at least $\ell_0$, which implies
what we wanted to show by the last equality in (5.16). □

The lower bound on d is exact in this case. Continuing with the notation 
from (5.15), we have the following observation. 

Exercise 6. Let $\alpha_i$, $i = 1, \ldots, s$ be any $s < q - 1$ distinct nonzero elements
of $\mathbb{F}_q$.

a. Show that the polynomial

\[ f(t_1, \ldots, t_m) = (t_1^{q-1} - 1) \cdots (t_r^{q-1} - 1)(t_{r+1} - \alpha_1) \cdots (t_{r+1} - \alpha_s) \]

is an element of $L_\nu$ which is nonzero at exactly $(q - s) q^{m-r-1} - 1$ points
other than the origin in $\mathbb{F}_q^m$.

b. What is the variety $V = V(f)$ in $\mathbb{F}_q^m$? (By (5.15), among the varieties
$V(g)$ for $g \in L_\nu$, $V$ has the largest possible number of $\mathbb{F}_q$-rational
points.)

c. Show that the minimum distance of the code $RM_q(m, \nu)^*$ is exactly
$(q - s) q^{m-r-1} - 1$.

d. Deduce that the minimum distance of the unpunctured code $RM_q(m, \nu)$
is exactly $(q - s) q^{m-r-1}$.

For example, consider the code $RM_3(2, 1)$, which has block length $n = 9$
and dimension $k = 3$ over $\mathbb{F}_3$. By (5.14), the generator polynomial of the
cyclic code $RM_3(2, 1)^*$ has roots $\alpha^\ell$, where $0 < \ell < 8$ and $0 < w_3(\ell) \le
m(q - 1) - \nu - 1 = 2(3 - 1) - 1 - 1 = 2$. This gives $\ell = 1, 2, 3, 4, 6$. The
longest consecutive string has length 4, so the minimum distance of the
punctured code satisfies $d \ge 5$; by Exercise 6 this bound is attained, so $d = 5$.
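The root exponents in this example, and the bound of Corollary (5.15) more generally, depend only on base-$q$ digit sums, so they are simple to compute. The sketch below is an illustration; the function names are hypothetical and nothing here is specific to a particular field representation.

```python
def digit_sum(n, q):
    """Sum of the base-q digits of the nonnegative integer n."""
    total = 0
    while n:
        n, r = divmod(n, q)
        total += r
    return total


def root_exponents(q, m, nu):
    """Exponents l with alpha^l a root of the generator polynomial of
    RM_q(m, nu)^*, according to Proposition (5.14)."""
    bound = m * (q - 1) - nu - 1
    return [l for l in range(1, q ** m - 1) if 0 < digit_sum(l, q) <= bound]


def consecutive_run_from_one(exponents):
    """Length of the run 1, 2, 3, ... contained in `exponents`."""
    present = set(exponents)
    l = 1
    while l in present:
        l += 1
    return l - 1


def corollary_bound(q, m, nu):
    """Lower bound (q - s) q^(m-r-1) - 1 of Corollary (5.15) for the
    punctured code, where nu = r (q - 1) + s with 0 <= s < q - 1."""
    r, s = divmod(nu, q - 1)
    return (q - s) * q ** (m - r - 1) - 1


exps = root_exponents(3, 2, 1)
print(exps)
print(consecutive_run_from_one(exps) + 1, corollary_bound(3, 2, 1))
```

For $RM_3(2, 1)^*$ this reproduces the list $1, 2, 3, 4, 6$, a run of length 4, and hence the bound 5 in both forms. The same comparison can be repeated for other small $q$, $m$, $\nu$ to see the run length plus one agree with $(q - s) q^{m-r-1} - 1$.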

The minimum distances of the geometric Goppa codes are even more
delicate. Indeed, there are very few general statements one can make apart
from a lower bound complementary to the Singleton bound from Exercise
7 of §2. If $C$ is a geometric Goppa code of block length $n$ and dimension $k$
from a curve $X$ of genus $g$, then

\[ n + 1 - k - g \le d \le n + 1 - k. \]

We will not prove this statement here. Although the proof is not difficult,
it uses the Riemann-Roch theorem and the basic fact that the number
of zeros of a rational function $f$ on a projective curve is the same as the
number of poles of $f$ (counted with multiplicities). Proofs can be found in
[Sti].

The exact determination of $d$ for a geometric Goppa code involves the
subtle question of how many zeros a function in $L$ can have at the
$\mathbb{F}_q$-rational points on $X$, so there are both geometric and arithmetic issues
that must be addressed. We will only present a simple example where the
geometry suffices to understand what is going on.

Consider the Hermitian code over $\mathbb{F}_4$ given by the generator matrix in
(5.2). By construction, the entries in each column give the homogeneous
coordinates $(1, t_1, t_2)$ of some $\mathbb{F}_4$-rational point on the cubic curve with
homogeneous equation $t_1^3 + t_2^2 t_0 + t_2 t_0^2 = 0$. To study the minimum distance,
we must consider all possible nonzero codewords, so consider a general
linear combination of the three rows of (5.2). Each entry in such a linear
combination has the form $a + b t_1 + c t_2$ for some $a, b, c \in \mathbb{F}_4$, and $(t_1, t_2) \in
X(\mathbb{F}_4)$. How many of these can be zero? We claim that for each triple
$(a, b, c) \ne (0, 0, 0)$, the equations

\[ t_1^3 + t_2^2 + t_2 = 0, \qquad a + b t_1 + c t_2 = 0 \]

have at most 3 common solutions in $\mathbb{F}_4$. To see this, note that if $b$ is nonzero,
we can eliminate $t_1$ between the two equations to yield a cubic in $t_2$.
Similarly, if $b = 0$, but $c \ne 0$, we can eliminate $t_2$, yielding a cubic in $t_1$. If
$b = c = 0$, then $a \ne 0$, and we have a codeword with all entries nonzero.
Hence every nonzero codeword has at most 3 zero and at least 5 nonzero
entries. As a result $C$ is an $[8, 3, 5]$ code over $\mathbb{F}_4$. By Proposition (2.1) of
this chapter, any two errors in a transmitted word can be corrected by
nearest neighbor decoding.
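Since this code has only $4^3 - 1 = 63$ nonzero codewords, its minimum distance can also be found by exhaustive search. The sketch below is an illustration under stated assumptions: $\mathbb{F}_4$ is encoded as the integers 0 to 3 with 2 standing for a root $\alpha$ of $\alpha^2 + \alpha + 1$, addition is XOR, the evaluation points are listed in whatever order the enumeration produces (which need not match the column order of (5.2)), and the function names are hypothetical.

```python
from itertools import product


def gf4_mul(a, b):
    """Multiply two elements of F_4, encoded as 2-bit integers."""
    prod = 0
    for i in range(2):
        if (b >> i) & 1:
            prod ^= a << i
    if prod & 0b100:          # reduce modulo alpha^2 + alpha + 1
        prod ^= 0b111
    return prod


def gf4_cube(a):
    return gf4_mul(a, gf4_mul(a, a))


# The eight affine F_4-points of t1^3 + t2^2 + t2 = 0.
points = [(t1, t2) for t1, t2 in product(range(4), repeat=2)
          if (gf4_cube(t1) ^ gf4_mul(t2, t2) ^ t2) == 0]


def codeword(a, b, c):
    """Evaluate a + b t1 + c t2 at every point of the curve."""
    return [a ^ gf4_mul(b, t1) ^ gf4_mul(c, t2) for (t1, t2) in points]


def minimum_distance():
    """Smallest Hamming weight among the 63 nonzero codewords."""
    weights = []
    for a, b, c in product(range(4), repeat=3):
        if (a, b, c) != (0, 0, 0):
            weights.append(sum(1 for entry in codeword(a, b, c) if entry != 0))
    return min(weights)


print(len(points), minimum_distance())
```

The search reports 8 points and minimum weight 5, in agreement with the $[8, 3, 5]$ parameters derived above. A weight 5 codeword arises, for example, from a line $t_2 = \text{constant}$ that meets the curve in three rational points.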

Of course, what we have done here is equivalent to applying Bezout's
Theorem to give an upper bound for the number of zero
entries in a codeword. Because $X$ is a cubic curve, it meets each line $V(a +
bx + cy)$ in at most three points, and this explains our derivation of the fact
$d = 5$ above. But this upper bound is often not the best bound possible.

We will conclude this section with a brief word about why the geometric
Goppa codes have been such an important development in coding theory.
Several years after Goppa introduced these codes, a breakthrough was made
by Tsfasman, Vladut, and Zink (see [TVZ]), who realized that geometric
Goppa codes can be constructed whose parameters $[n, k, d]$ improve on the
Gilbert-Varshamov bound from Exercise 7 of §2 of this chapter. Recall,
given the parameters $n$ and $d$ for a code, we can ask: How big can $k$ be?
Let

\[ A_q(n, d) = \max\{q^k : \text{there exists a code with parameters } [n, k, d]\}. \]

Each codeword must be the center of a ball of radius $d - 1$ in the Hamming
distance that contains no other codewords, so

\[ A_q(n, d) \ge q^n / B(n, d - 1), \tag{5.17} \]

where $B(n, d - 1) = \sum_{i=0}^{d-1} \binom{n}{i} (q - 1)^i$ is the number of words in a ball of
radius $d - 1$ in $\mathbb{F}_q^n$. Writing $\delta = d/n$, we take logarithms and let $n \to \infty$
to obtain

\[ \alpha_q(\delta) = \limsup_{n \to \infty} \frac{1}{n} \log_q A_q(n, \delta n), \]

the "asymptotic best value" for $R = k/n$. Using Stirling's formula, in the
limit as $n \to \infty$, the bound (5.17) gives an inequality

\[ \alpha_q(\delta) \ge 1 - H_q(\delta), \]

where

\[ H_q(\delta) = \delta \log_q(q - 1) - \delta \log_q(\delta) - (1 - \delta) \log_q(1 - \delta) \]

is called the entropy function. This asymptotic form of (5.17) says that the
best possible code has a rate $R \ge 1 - H_q(\delta)$. In a Cartesian coordinate
system, plotting the information rate $R = k/n$ versus the relative minimum
distance $\delta = d/n$, the graph $R = 1 - H_q(\delta)$ is decreasing and concave up,
intersecting the $R$-axis at $R = 1$ and the $\delta$-axis at $\delta = (q - 1)/q$. On
the other hand, Tsfasman, Vladut, and Zink showed that it is possible
to construct a sequence of geometric Goppa codes $C_i$ with $n \to \infty$ from
a sequence of curves $X_i$ over some fixed $\mathbb{F}_q$ such that the corresponding
points $(R_i, \delta_i)$ converge to a point on the line $R + \delta = 1 - \frac{1}{\sqrt{q} - 1}$. For
$q \ge 49$, this line lies above the graph $R = 1 - H_q(\delta)$ for $\delta$ in some interval.
Prior to this development many coding theorists apparently believed the
Gilbert-Varshamov bound was the best possible asymptotic lower bound
for long codes. But the results of [TVZ] show that codes better than the
previous best known lower bound must exist (with $q$ and $n$ big). In other
words geometric Goppa codes are the "current champions" for long codes!
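The comparison between the two asymptotic bounds can be explored numerically. The sketch below is an illustration only: it assumes the Tsfasman-Vladut-Zink line in the form $R = 1 - 1/(\sqrt{q} - 1) - \delta$ for a square $q$, samples $\delta$ on a uniform grid, and uses hypothetical function names.

```python
from math import log, sqrt


def entropy_q(delta, q):
    """The q-ary entropy function H_q(delta) for 0 < delta < 1."""
    return (delta * log(q - 1, q)
            - delta * log(delta, q)
            - (1 - delta) * log(1 - delta, q))


def gv_rate(delta, q):
    """Asymptotic Gilbert-Varshamov rate 1 - H_q(delta)."""
    return 1 - entropy_q(delta, q)


def tvz_rate(delta, q):
    """Rate on the Tsfasman-Vladut-Zink line (q assumed to be a square)."""
    return 1 - 1 / (sqrt(q) - 1) - delta


def best_gap(q, steps=1000):
    """Largest sampled value of (TVZ rate - GV rate) for
    delta strictly between 0 and (q - 1)/q."""
    deltas = (i / steps * (q - 1) / q for i in range(1, steps))
    return max(tvz_rate(d, q) - gv_rate(d, q) for d in deltas)


for q in (16, 25, 49, 64):
    print(q, best_gap(q) > 0)
```

On this grid the gap is negative for $q = 16$ and $q = 25$ and becomes positive at $q = 49$, which is consistent with the threshold quoted in the text. The improvement for $q = 49$ is small and confined to an interval of $\delta$ around $1/2$.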

The original proof of this result uses the sequence of modular curves
$X_0(\ell)$ over $\mathbb{F}_q$, another class of curves with many rational points over finite
fields and possessing a long pedigree in number theory. More elementary
constructions of families of curves affording codes exceeding the asymptotic
Gilbert-Varshamov bound have also been found recently by Garcia and
Stichtenoth (see [GS]).



Additional Exercises for §5 

Exercise 7. Find the roots of the generator polynomial of i?Ms(2,2)*, 
and determine the minimum distance of this code and the unpunctured 
code.

Exercise 8. In this exercise, you will complete the proof of Proposition
(5.6). Let $\mathbb{F}$ be an algebraically closed field containing $\mathbb{F}_q$, and consider the
finite set $\mathbb{F}_q^m \subset \mathbb{F}^m$. Let $\overline{I}$ denote the ideal generated by the $t_i^q - t_i$ in
the ring $\mathbb{F}[t_1, \ldots, t_m]$. As in the proof of Proposition (5.6), $I$ denotes the ideal
generated by the $t_i^q - t_i$ in $\mathbb{F}_q[t_1, \ldots, t_m]$.

a. Show that $V(\overline{I}) = \mathbb{F}_q^m$.

b. Show that

\[ \dim_{\mathbb{F}} \mathbb{F}[t_1, \ldots, t_m]/\overline{I} = q^m = |V(\overline{I})|. \]

c. Deduce that $\overline{I}$ is a radical ideal, so that $\overline{I} = I(\mathbb{F}_q^m)$. Hint: consult
Theorem (2.10) from Chapter 2 of this book. Note that that theorem
was stated for the field $\mathbb{C}$, but the proof relies only on the fact that $\mathbb{C}$
is algebraically closed so that the Strong Nullstellensatz applies.

d. Deduce that

\[ \overline{I}_{\le\nu} \cap \mathbb{F}_q[t_1, \ldots, t_m] = I_{\le\nu} \]

for all $\nu$.

Exercise 9. Show that if $A$ is the companion matrix of the minimal
polynomial of a primitive element $\alpha$ of $\mathbb{F}_{q^m}$ over $\mathbb{F}_q$ and $\varphi$ is the mapping
from (5.10), then for all row vectors $x \in \mathbb{F}_q^m$, $\varphi(xA) = \alpha\varphi(x)$, where the
right-hand side is the product in $\mathbb{F}_{q^m}$.

Exercise 10. 

a. Show that the Reed-Muller code $RM_q(m, \nu)$ is invariant under the
action of the full affine group $\mathrm{Aff}(m, \mathbb{F}_q)$, consisting of all transformations
on row vectors $x$ of the form

\[ T : \mathbb{F}_q^m \to \mathbb{F}_q^m, \qquad x \mapsto xA + b, \]

where $A$ is an invertible $m \times m$ matrix with entries in $\mathbb{F}_q$, and $b$ is a row
vector in $\mathbb{F}_q^m$.

b. Deduce that if $RM_q(m, \nu)$ is punctured in any one position (not just
the position corresponding to the origin in $\mathbb{F}_q^m$), the resulting code is
equivalent to a cyclic code of block length $q^m - 1$.

Exercise 11. 

a. Let $0 \le u \le q^m - 1$. Show that $w_q(u) + w_q(q^m - 1 - u) = m(q - 1)$.
Hint: What is the base-$q$ expansion of $q^m - 1$?

b. Show that the number of integers $u$ in the range $0 \le u \le q^m - 2$ with
either $u = 0$, or $w_q(u) > m(q - 1) - \nu - 1$ is the same as the number
of integers $v$, $0 \le v \le q^m - 2$, with $w_q(v) \le \nu$.

c. Deduce that the number of integers as in part b is the same as the
dimension of $RM_q(m, \nu)^*$. Hint: Use Exercise 4.

Exercise 12. (For readers of Chapter 5) Although it is not exactly obvious
at first glance from the generator matrix given in (5.2), the codewords of
the $[8, 3, 5]$ code $C$ over $\mathbb{F}_4$ considered there have the property that if the
blocks of columns (3,5,7) and (4,6,8) are simultaneously permuted by the
same cyclic permutation, while columns 1,2 are left unchanged, the result
is another codeword. Let $C_2$ be the second row of the generator matrix. For
instance, shifting one place to the right within each block, $C_2$ is transformed
as follows:

\[ (0\ 0\ 1\ 1\ \alpha\ \alpha\ \alpha^2\ \alpha^2) \mapsto (0\ 0\ \alpha^2\ \alpha^2\ 1\ 1\ \alpha\ \alpha). \]

The result is $\alpha^2 \cdot C_2 \in C$. Let $S = \mathbb{F}_4[t]/\langle t^3 - 1 \rangle$.

a. Prove the invariance property claimed for the codewords of $C$. Hint:
Look at the other two rows of the generator matrix.

b. Using the invariance property described above and following what we did
for cyclic codes in the text, find a way to rewrite the codewords of $C$ as
4-tuples of polynomials in such a way that $C$ becomes a module over the
ring $S$ (or over $\mathbb{F}_4[t]$). Multiplication by $t \in S$ in your module should be
the same as the cyclic permutation on blocks of entries described above.
More generally, any code invariant under a collection of m commuting 
block-wise cyclic permutations as in this example can be considered as a 
module over a polynomial ring in m variables, and systematic encoders 
can be constructed via module Gröbner bases (see Chapter 5). Systematic
encoders for many of the algebraic geometric Goppa codes can also be
constructed by this method. See [HLS] for more details.
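
To experiment with the invariance property, it can help to have the block-wise permutation itself in executable form. The sketch below assumes words are given as 8-tuples with columns numbered 1 through 8 as in the exercise; the function name `shift_blocks` is hypothetical. It only implements the permutation and does not encode the code C or the field F4.

```python
def shift_blocks(word):
    """Shift one place to the right within column blocks (3,5,7) and (4,6,8).

    Columns 1 and 2 are left unchanged. Columns are 1-indexed as in the text.
    """
    if len(word) != 8:
        raise ValueError("expected a word of length 8")
    shifted = list(word)
    for block in ((3, 5, 7), (4, 6, 8)):
        indices = [column - 1 for column in block]
        values = [word[i] for i in indices]
        # Cyclic shift to the right: the last entry of the block moves to the front.
        rotated = values[-1:] + values[:-1]
        for i, value in zip(indices, rotated):
            shifted[i] = value
    return tuple(shifted)
```

Applying `shift_blocks` three times returns the original word, which is consistent with working modulo $t^3 - 1$ in part b.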

Exercise 13. This exercise requires some prior knowledge of algebraic 
curves, in particular the Riemann-Roch theorem and how linear systems of 
divisors can be used to define mappings of a curve into projective space. We 
will show how to re-embed the Klein Quartic curve K from Exercise 3 to 
put it in special position, and construct one-point geometric Goppa codes 
by evaluating polynomials. Consider the point $Q_2 = (t_0, t_1, t_2) = (0, 0, 1)$
on $K = V(t_1^3 t_2 + t_2^3 t_0 + t_0^3 t_1)$ in $\mathbb{P}^2$ over $\mathbb{F}_8$.

a. What are the divisors of the rational functions $x = t_1/t_0$ and $y = t_2/t_0$
on K? (Note x, y are just the usual affine coordinates in the plane.) 

b. Show that $\{1, y, xy, y^2, x^2 y\}$ is a basis for the vector space $L(7Q_2)$ of
rational functions on $K$ with poles of order $\le 7$ at $Q_2$ and no other
poles.

c. Show that the rational mapping defined by the linear system $|7Q_2|$ is
an embedding of $K$ into $\mathbb{P}^4$. (Hint: Show that it separates points and
tangent vectors.) 

d. By part b, in concrete terms, the mapping from c is defined on the curve
$K$ by

$\psi : K \to \mathbb{P}^4$

$(t_0, t_1, t_2) \mapsto (u_0, u_1, u_2, u_3, u_4) = (t_0^3, t_2 t_0^2, t_1 t_2 t_0, t_2^2 t_0, t_1^2 t_2)$.

Show that the image of $\psi$ is the variety $K' =$

$V(u_0 u_3 + u_1^2, \; u_2^2 + u_1 u_4, \; u_0 u_2 + u_3^2 + u_2 u_4, \; u_1 u_2 u_3 + u_0^2 u_4 + u_0 u_4^2)$,

a curve of degree 7 and genus 3 in $\mathbb{P}^4$ isomorphic to $K$.

e. Show that $K'$ is in special position. Hint: What is $K' \cap V(u_0)$?

f. Using the vector space $L_1$ spanned by the functions $1, u_1/u_0, \ldots, u_4/u_0$
on $K'$, and $V = \psi(K(\mathbb{F}_8) \setminus \{Q_2\})$, construct the geometric Goppa code
$C(V, L_1)$. What are the parameters of this code? (In terms of the original
Klein Quartic, this is the one-point code $C(K(\mathbb{F}_8) \setminus \{Q_2\}, L(7Q_2))$.)
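
Parts d and f lend themselves to a small computer experiment. The sketch below is an illustration under stated assumptions, not part of the exercise: it models $\mathbb{F}_8$ as $\mathbb{F}_2[x]/\langle x^3 + x + 1 \rangle$ with elements stored as integers 0 to 7, takes the Klein Quartic as $t_1^3 t_2 + t_2^3 t_0 + t_0^3 t_1 = 0$, and uses the monomial map $(t_0^3, t_2 t_0^2, t_1 t_2 t_0, t_2^2 t_0, t_1^2 t_2)$ together with the four equations $u_0 u_3 + u_1^2$, $u_2^2 + u_1 u_4$, $u_0 u_2 + u_3^2 + u_2 u_4$, $u_1 u_2 u_3 + u_0^2 u_4 + u_0 u_4^2$ for the image. All function names are hypothetical. The monomial formula is applied only at points with $t_0 \ne 0$, since all five monomials vanish at the two points of the curve with $t_0 = 0$.

```python
MODULUS = 0b1011  # x^3 + x + 1, irreducible over F_2 (assumed model of F_8)


def gf8_mul(a: int, b: int) -> int:
    """Multiply two elements of F_8 stored as 3-bit integers."""
    product = 0
    for i in range(3):
        if (b >> i) & 1:
            product ^= a << i
    # Reduce the degree-4 and degree-3 terms modulo x^3 + x + 1.
    for bit in (4, 3):
        if (product >> bit) & 1:
            product ^= MODULUS << (bit - 3)
    return product


def gf8_prod(*factors: int) -> int:
    """Product of any number of F_8 elements."""
    result = 1
    for factor in factors:
        result = gf8_mul(result, factor)
    return result


def on_klein(t0: int, t1: int, t2: int) -> bool:
    """Test t1^3 t2 + t2^3 t0 + t0^3 t1 == 0 (addition in F_8 is XOR)."""
    value = (gf8_prod(t1, t1, t1, t2)
             ^ gf8_prod(t2, t2, t2, t0)
             ^ gf8_prod(t0, t0, t0, t1))
    return value == 0


def projective_points():
    """One normalized representative for each point of P^2(F_8)."""
    points = [(1, a, b) for a in range(8) for b in range(8)]
    points += [(0, 1, a) for a in range(8)]
    points.append((0, 0, 1))
    return points


def psi(t0: int, t1: int, t2: int):
    """The monomial map (t0^3, t2 t0^2, t1 t2 t0, t2^2 t0, t1^2 t2)."""
    return (gf8_prod(t0, t0, t0), gf8_prod(t2, t0, t0), gf8_prod(t1, t2, t0),
            gf8_prod(t2, t2, t0), gf8_prod(t1, t1, t2))


def image_equations(u):
    """Values of the four assumed defining equations of the image at u."""
    u0, u1, u2, u3, u4 = u
    return (
        gf8_mul(u0, u3) ^ gf8_mul(u1, u1),
        gf8_mul(u2, u2) ^ gf8_mul(u1, u4),
        gf8_mul(u0, u2) ^ gf8_mul(u3, u3) ^ gf8_mul(u2, u4),
        gf8_prod(u1, u2, u3) ^ gf8_prod(u0, u0, u4) ^ gf8_prod(u0, u4, u4),
    )


curve_points = [p for p in projective_points() if on_klein(*p)]
affine_points = [p for p in curve_points if p[0] == 1]
```

Evaluating `image_equations(psi(*p))` at each entry of `affine_points` gives a direct check that those points land on the variety described in part d, and the list `affine_points` is a natural starting point for assembling the evaluation vectors needed in part f.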

Exercise 14. The ideal $I = \langle t_i^q - t_i : 1 \le i \le m \rangle \subset \mathbb{F}_q[t_1, \ldots, t_m]$ from
Proposition (5.6) is not homogeneous, so the theory of Hilbert functions
developed in Chapter 6 does not apply. To remedy this, we introduce the
homogeneous ideal

$\bar{I} = \langle t_i^q - t_0^{q-1} t_i : 1 \le i \le m \rangle \subset R = \mathbb{F}_q[t_0, t_1, \ldots, t_m]$.

As in Chapter 6, $\bar{I}_\nu$ will denote the vector space of homogeneous
polynomials of $\bar{I}$ of degree $\nu$.

a. Consider the map $I_{\le \nu} \to \mathbb{F}_q[t_0, t_1, \ldots, t_m]$ defined by

$f(t_1, \ldots, t_m) \mapsto t_0^{\nu} f(t_1/t_0, \ldots, t_m/t_0)$.

Show that the image of this map is $\bar{I}_\nu$, and conclude that the affine
Hilbert function from Proposition (5.6) equals the Hilbert function
$H_{R/\bar{I}}$ defined in Chapter 6. Hint: Since the $t_i^q - t_i$ form a Gröbner basis
of $I$, explain why $f \in I_{\le \nu}$ can be written $f = \sum_{i=1}^m g_i \cdot (t_i^q - t_i)$, where
$\deg g_i \le \nu - q$. (Another way to prove part a is to show that $\bar{I}$ is the
homogenization of $I$, as defined in Chapter 8, §4 of [CLO], and then the
equality of Hilbert functions follows from Theorem 12 of Chapter 9, §3
of [CLO].)

b. In order to compute the Hilbert function of $R/\bar{I}$ by the methods of
Chapter 6 of this book, we need a free resolution of $\bar{I}$. The resolution we
will use is the Koszul complex, which is discussed (in a special case) in
Exercise 10 of Chapter 6, §2. Write out the Koszul complex for the ideal
$\bar{I}$, keeping careful track of the twists used at every step of the complex.

c. Proving that the Koszul complex of part b gives a resolution of $\bar{I}$ involves
some advanced concepts (the polynomials $t_i^q - t_0^{q-1} t_i$ form a regular
sequence, and the Koszul complex of a regular sequence is a resolution;
see Chapter 17 of [Eis]). Assuming the exactness of the Koszul complex,
show that this leads directly to the formula (5.7) in the text.
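
One way to check an answer to part c is to compare the alternating sum coming from a Koszul-type resolution with a brute-force count. The sketch below rests on two assumptions that are not taken from the text: that the $j$-th term of the Koszul complex on $m$ generators of degree $q$ is $R(-jq)^{\binom{m}{j}}$, and that the affine Hilbert function counts monomials in $t_1, \ldots, t_m$ with every exponent at most $q - 1$ and total degree at most $\nu$. The function names are hypothetical, and the resulting expression is not claimed to be written in the same form as formula (5.7).

```python
from itertools import product
from math import comb


def koszul_hilbert_function(q: int, m: int, nu: int) -> int:
    """Alternating sum over the assumed Koszul terms R(-jq)^binom(m, j).

    R has m + 1 variables, so dim R_d = binom(d + m, m) for d >= 0.
    """
    total = 0
    for j in range(m + 1):
        shifted_degree = nu - j * q
        if shifted_degree < 0:
            break  # R(-jq) contributes nothing in degree nu from here on
        total += (-1) ** j * comb(m, j) * comb(shifted_degree + m, m)
    return total


def count_reduced_monomials(q: int, m: int, nu: int) -> int:
    """Monomials in m variables with exponents <= q - 1 and degree <= nu."""
    return sum(1 for exps in product(range(q), repeat=m) if sum(exps) <= nu)
```

For small parameters the two functions can be compared directly, for instance `koszul_hilbert_function(3, 2, 2)` against `count_reduced_monomials(3, 2, 2)`; agreement supports the bookkeeping of twists in part b without proving exactness.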
367-369, 372, 378
property of mixed orders, 155 
Elimination property of resultants, 
72, 73 

Elimination Theorem, 15, 16, 25, 37, 
173 

see also Local Elimination 
Theorem 

Emiris, I., x, 122, 124, 129, 305, 324,
346, 347, 351-353, 355-358, 
469, 470 

encoding, 407, 415, 416, 419, 421, 
424, 432 

function, 415, 430 
systematic, see systematic, encoder 
engineering, 360 
entropy function, 464 
Equal Ideals Have Equal Varieties 
principle, 20 
equivalent 

codes, 459, 465 
analytic functions, 137, 138 
matrices, 197 
modules, 218, 221 
errata, x 

error, 415, 415, 421, 445 
locations, 436, 439, 446 
locator polynomial, 437, 438, 446, 
447 

polynomial, 436, 437, 447 
values, 436, 446 

error-correcting, 415, 417-419, 424, 
435, 436, 445-447, 463 
burst, 426, 435 
codes, 407, 415ff 

error-detecting, 415-417, 419, 424 
Euclidean algorithm, 76, 436, 448 
Euclidean geometry, 170 
Euler characteristic, 404 
Euler’s formula, 106, 107 
evaluation mapping, 449, 450, 452, 
457, 458 

Ewald, G., 316, 320, 323, 346, 470 
exact sequence, 234ff, 246, 247, 

251, 261, 262, 264, 265, 267, 
275-277, 285, 383, 403, 404, 
453, 456, 467 
split, 245 

exponent vector of a monomial, see 
monomial, exponent vector of 
Extension Theorem, 25, 27, 28, 57 
extraneous factor, 99ff 

$\mathbb{F}_q^n$, see affine, n-dimensional space
over $\mathbb{F}_q$

face 

of a polytope, 292, 293, 318, 333, 
344, 345, 349, 389 
ring, 406 
facet 



of a polytope, 293, 294, 309, 316, 
318-320, 324, 326, 332 
variable, 309, 312, 313, 316 
Farin, G., 385, 470 
Faires, J., ix, 32, 55, 469 
Faugère, J., 46, 56, 443, 470
feasible region, 360, 361, 364, 370, 
371, 374, 376 

Fenchel, W., ix, 316, 320, 323, 469 
FGLM algorithm, see Gröbner basis
conversion 

field 

algebraically closed, see 
algebraically closed field 
automorphism, 414 
finite, see finite field 
isomorphism of, 408, 411, 413, 414 
of rational functions, see rational 
function field 
prime, see prime field 
residue, see residue field 
finite-dimensional algebra, see 
algebra 

finite element method, 385 
finite field, 104, 208, 220, 407ff, 415ff, 
424ff, 436ff, 449ff 

finite free resolution, 239-242, 244, 
245, 247, 249 
graded, 257, 270 
minimal, 260 
see also free resolution 
Finite Generation of Invariants, 281 
finite group, 281, 283, 284, 289, 407 
abelian, 412 

finite matrix group, see finite group 
Finiteness Theorem, 37, 41, 51, 58, 
276 

in module case, 210, 440 
finitely generated module, 183, 192, 
196, 226, 228, 230-232, 237, 
245, 246, 256, 257, 260, 266, 
267, 269, 270, 282 
First Fundamental Theorem of 
Invariant Theory, 92 
First Isomorphism Theorem, see 
Fundamental Theorem of 
Homomorphisms 
first syzygy module, 189, 239-241,
248, 266 

see also syzygy module 
Fitting 

equivalent matrices, 197 
ideal, see Fitting, invariant 
invariant (Fi(M)), 230, 232, 233 
Fitzpatrick, P., x, 436, 440, 443, 444,
470 

fixed point iteration, 33 
Fogarty, J., 312, 470 
follow generically, 173 
see also generic 

Formal Implicit Function Theorem, 
136 

formal infinite product, 410 
formal power series, 288, 403, 409, 
437 

ring ($k[[x_1, \ldots, x_n]]$), 133ff, 146,
156, 162, 165, 175, 176, 222, 
226, 

Forney formula, 439, 446 
free module ($R^m$), 180ff, 192, 197ff,
231-233, 236ff, 245ff, 395, 396, 
400, 401, 403, 404 
graded, see graded free module 
over a local ring, 222ff 
twisted, see graded free module 
free resolution, viii, 208, 234, 239ff, 
245ff, 252, 284, 285, 383, 467 
finite, see finite free resolution 
graded, see graded, resolution 
isomorphism of, see isomorphic 
resolutions 

partial, see partial graded 
resolution 

trivial, see trivial resolution 
Friedberg, S., 149, 471 
Frobenius automorphism, 415, 435, 
461 

Fulton, W., 307, 316, 323, 336, 346, 
375, 471 

function field, see rational function, 
field 

fundamental lattice polytope, 319,
320, 323, 325 
Fundamental Theorem 
of Algebra, 331 



of Homomorphisms, 52, 141, 414 
on Discrete Subgroups of Euclidean 
Space, 319 

Galois, E., 27 
Galois group, 415 
Galois theory, 407, 415, 435 
Garcia, A., 464, 471 
Garrity, T., x, 102, 468 
Gatermann, K., 340, 475 
Gauss-Jordan elimination, 422 
GCD, 39, 40, 46, 69, 101, 198, 273, 
275, 276, 279, 285, 286, 289, 
348, 352, 357, 436, 448 
Gelfand, I., 76, 80, 87, 89, 94, 103,
301, 302, 304, 307, 308, 316, 

336, 342, 343, 353, 471 
generalized characteristic polynomial, 

105, 115, 356 
toric, 356 
generalized 

companion matrix, 128 
eigenspace, see eigenspace, 
generalized 

eigenvector, see eigenvector, 
generalized 

generating function, 288, 409, 410, 
456 
generator 

matrix (of a linear code), 416-419, 
421, 424-426, 432, 450, 463, 
466 

polynomial (of a cyclic code), 425- 
428, 431-436, 447, 459-462, 
464 

generic, 108-110, 116, 118, 121, 122, 
125, 279, 328-330, 335-337, 
340, 342, 346, 354, 401 
linear subspace, 270 
polynomial in $L(\mathcal{A})$, 296,
system of equations, 330, 331, 334, 

337, 353 

number of solutions, 332 
genus, 452-454, 462, 467 
geometric Goppa code, 407, 448ff 
one-point, 466 
geometric modeling, 305 
geometric series 
formal summation formula, 134, 
403, 409, 410, 437 
finite summation formula, 411 
Geometry Center at the University 
of Minnesota, 348 
Georg, K., 338, 468 
germ of an analytic function, 138 
getform (Maple procedure), 68, 69 
getmatrix (Maple procedure), 63, 
68, 73, 92 

Gianni, P., 46, 56, 443, 470
Gilbert-Varshamov bound, 420, 463,
464 

GL(2,R), 282, 283, 289 
GL(m,R), 197, 281, 284
Goppa, V., 448, 449, 452, 463 
Goppa code, see geometric Goppa 
code 

Gorenstein codimension 3 ideal, 288 

Gräbe, H., 158, 173, 471

graded 

free resolution, see graded, 
resolution 

graded resolution, 252, 257ff, 266, 
268ff, 285-288 

homomorphism, 255ff, 268, 281
isomorphism, 259, 262 
matrix, 256 

minimal resolution, see minimal, 
graded resolution 

module, 208, 209, 241, 253ff, 266ff, 
283, 402-404 

resolution, 252, 257ff, 266, 268ff,
285-288 

submodule, 254, 261 
subring, 281 

twisted module, see twist of a 
graded module M 
graded free module, 254ff, 404 
standard structure ($R^m$), 254
twisted structure ($R(d_1) \oplus \cdots \oplus
R(d_m)$), 255ff, 268ff, 283ff
Graded Hilbert Syzygy Theorem, 

see Hilbert Syzygy Theorem, 
Graded 

graded reverse lexicographic order, 
see grevlex 



graph, 398 
connected, 398, 400 
dual, 398, 400 

edge of, see edge, of a graph 
oriented edge of, see edge, oriented 
vertex of, see vertex, of a graph 
Grassmann, H., 158, 471 
greatest common divisor, see GCD 
Green, M., 273 
Greuel, G.-M., x, 158, 471 
grevlex, 8, 10, 15, 36, 38, 42, 46, 47, 
56, 57, 68, 205, 208-210, 241, 
242, 247, 249 
grlex, 209 

Gröbner basis, vii-ix, 12, 15, 16, 25,
27, 28, 34, 36, 40, 48, 49, 56, 
58, 60, 68, 76, 81, 121, 138, 
151, 163, 165-167, 169, 176, 
179, 208, 238, 240, 274, 278, 
328-330, 338, 375, 377-380, 
382, 384, 429-431, 457, 467 
in integer programming, 359, 365ff 
for modules, 197, 200, 204ff, 210ff, 
245ff, 257, 385, 386, 394-398,
401, 436, 439-444, 446-448, 
466 

for modules over local rings, 222ff 
monic, 15, 51, 204, 207, 250; see 
also Uniqueness of Monic 
Gröbner Bases

reduced, 15, 37, 42, 62, 204, 207, 
214, 247, 253, 254, 328, 369, 
430, 432, 441, 445 
software for linear programming, 
373 

specialization of, 328 
see also standard basis 
Gröbner basis conversion, 46ff, 56,
443, 444 
Main Loop, 46ff 
Next Monomial, 47ff 
Termination Test, 47ff 
group 

affine, see affine, group 
cyclic, see cyclic, group 
finite, see finite group 
finite matrix, see finite group 
Galois, see Galois group 
invariant theory of, see invariant
theory, of finite groups 
multiplicative, see multiplicative 
group 

Gusein-Zade, S., 178, 468 

Half-space, 361 

Hall’s marriage theorem, 385 

Hamming 

code, 418, 419, 421-424 
distance, 417, 464 
Hasse-Weil 
bound, 454, 455 
Theorem, 452, 453 
Heegard, C., 436, 449, 458, 471 
heptagon, 316, 324 
hereditary complex, 389-391, 393, 
395, 398-401 
Hermitian 

curve, 454, 457 
code, 463 

Herstein, L, 27, 54, 65, 412, 471 
Herzog, J., 285, 469 
hidden variables, 115ff, 120, 121, 128, 
145 

higher syzygies, 239, 240, 243 
see also syzygy module 
Hilbert, D., ix, 245, 263, 280, 471 
Gröbner basis algorithm for, 377,
378 

Hilbert basis, 377-380, 384, 385 
Hilbert Basis Theorem, 4, 12, 165, 
175 

Hilbert-Burch Theorem, 248, 252, 
264, 271, 279, 280, 286 
Hilbert function, viii, ix, 175, 208, 
266ff, 282, 285-288, 375, 381, 
382, 404, 455, 457, 467 
affine, 455-457, 467 
Hilbert polynomial, 264, 266, 270ff, 
285-288, 453, 457 

Hilbert series, 282, 288, 289, 403, 404 
Hilbert Syzygy Theorem, 245, 247, 
257 

Graded, 257, 259, 260, 263, 280, 
284 

Hironaka, H., 169, 471 
Høholdt, T., 449, 471



hold generically, 329 
see also generic 

homogeneous coordinates, 85, 88, 90, 
402, 463 

homogeneous element of a graded 
module, 253, 256 

homogeneous ideal, 240, 252, 253, 
266, 270, 276, 287, 288, 369, 
381, 451, 453, 467 

homogeneous polynomial, 2, 74, 78ff, 
89ff, 120, 159, 164, 166, 176, 
177, 208, 209, 253, 254, 256, 
263, 267, 276, 277, 281, 283, 
286, 302, 314, 321, 322, 347, 
401, 402 

weighted, see weighted 

homogeneous polynomial 
with respect to nonstandard 
degrees, 369, 370 

homogeneous syzygy, 211, 221, 283 
homogenize, 158, 160, 166, 176, 276, 
279, 286, 299, 343, 402, 467 
a resolution, 258, 263 
in the sparse or toric context, 309ff 
homomorphism 

Fundamental Theorem of, see 
Fundamental Theorem of 
Homomorphisms 

graded, see graded, homomorphism 
localization of, see localization, of 
a homomorphism 
module, see module 
homomorphism 

ring, see ring, homomorphism of
homotopy continuation method, 327, 
338-340 

homotopy system, 338 
Huber, B., 324, 336, 337, 340, 346, 
347, 471 

Huguet, F., 414, 473 
Huneke, C., 273, 284, 470 
hyperplane, 319, 323, 325 
at infinity, 90, 454 
hypersurface, 64 

Ideal, 3, 182, 425, 426, 429-433, 455, 
465, 467 

basis of, see basis of an ideal 
comaximal, see comaximal ideals 
elimination, see elimination, ideal 
generated by 

Gorenstein codimension 3, see 
Gorenstein codimension 3 
ideal 

homogeneous, see homogeneous 
ideal 

intersection, see intersection, of 
ideals 

maximal, see maximal ideal 
monomial, see monomial ideal 
of a variety ($I(V)$), 20
of ith minors ($I_i(A)$), 229, 230,
233, 266 

of leading terms ($\langle \mathrm{LT}(I) \rangle$), 12, 36,
37, 151, 164, 165, 168, 169, 
175, 176, 430, 456; see also 
module, of leading terms 
primary, see primary, ideal 
primary decomposition of, see 
primary, decomposition 
prime, see prime, ideal 
principal, see principal ideal 
product, see product, of ideals 
quotient, see quotient ideal 
radical, see radical, ideal 
radical of, see radical, of an ideal 
sum, see sum, of ideals 
variety of, see variety, of an ideal 
zero-dimensional, see 

zero-dimensional ideal 
Ideal-Variety Correspondence, 21-23
image, 180, 185, 190, 192, 234, 235, 
237, 240, 243, 258ff, 404 
Implicit Function Theorem, 338 
implicitization, 80, 81, 273, 278, 298, 
299, 303, 305 

inclusion-exclusion principle, 456 
information 

positions, 419, 430, 432 
rate, 420, 464 
initial value problem, 339 
injective, 235, 242, 260, 286, 313, 
341, 382, 414, 415 
inner product, 423, 433, 461 



Insel, A., 149, 471 
Integer Polynomial property of 
resultants, 72, 73 

integer programming, viii-ix, 359ff, 
374, 384 

integer programming problem, 360ff, 
376 

standard form of, see standard 
form 

Gröbner basis algorithm for
solving, 368, 372 
integer row operation, 341 
integral domain, 136 
interior cell of a polyhedral 

complex, see cell, interior, 
of a polyhedral complex 
Intermediate Value Theorem, 31 
interpolation, 60, 62, 63, 105, 127 
dense, 106 

Lagrange interpolation formula, 

41, 44 

probabilistic, 106 
sparse, 106 
Vandermonde, 106 
intersection 

of ideals ($I \cap J$), 6, 16, 22, 218-220
of submodules ($N \cap M$), 191, 218,
219, 242, 243 

of varieties ($V \cap W$), 18, 22
interval arithmetic, 63 
invariant of a group action, 281-283 
invariant theory 

First Fundamental Theorem of, see 
First Fundamental Theorem 
of Invariant Theory 
of finite groups, 266, 280ff, 365 
relation to combinatorics, 375 
Second Fundamental Theorem 
of, see Second Fundamental 
Theorem of Invariant Theory 
inversion problem, 300 
invertible matrix with entries in a 
ring R, 195 
see also GL(m, R) 
inward normal, see inward pointing 
normal 

inward pointing facet normal, see 
inward pointing normal 
inward pointing normal, 293, 294,
309-311, 316, 322, 324, 325, 
333, 336 

primitive, 294, 309, 319, 326, 331, 
332 

irreducible 

component of a variety, 86, 170, 
173 

factor, 100, 144 

polynomial, 80, 84, 86, 99, 149, 
150, 301, 302, 342, 407-412, 
414, 459 

variety, 22, 23, 86, 88 
isobaric polynomial, 100, 107, 108 
isolated singularity, see singularity, 
isolated 

isolated solution, 139 

see also singularity, isolated 
isomorphic resolutions, 259, 262 
isomorphism 

field, see field, isomorphism of 
graded, see graded, isomorphism 
module, see module isomorphism 
of resolutions, see isomorphic 
resolutions 

ring, see ring, isomorphism of
see also First, Second and Third
Isomorphism Theorems

Jacobian 

matrix or determinant, 82, 150 
variety, 449 
Jacobson, N., 430, 471 
Jouanolou, J., 89, 90, 92, 94, 95, 471 
Julia set, 34 

$k^n$, see affine, n-dimensional space
over k 

Kaltofen, E., 104, 469 
Kapranov, M., 76, 80, 87, 89, 93, 94, 
103, 301, 302, 304, 307, 308, 
316, 336, 342, 343, 353, 471 
kbasis (Maple procedure), 44, 45, 57 
kernel, 180, 181, 185, 188, 192-194, 
233-235, 237, 239, 240, 243, 
259ff, 376-379, 394-396, 404, 
405, 414, 456, 457 

key equation, 438, 439, 441-445, 447 



Khovanskii, A.G., 335, 337, 472 

Kirwan, F., 145, 472 

Klein Quartic curve, 454, 466, 467 

Klimaszewski, K., 274, 474 

Koblitz, N., 415, 472 

Koszul 

complex, 252, 275, 285, 286, 456, 
467 

relations, 228 
Kozen, D., 67, 468 
Kushnirenko, A.G., 335, 472 

L{A), 296, 300, 306, 316, 329, 330, 
353, 356 

see also Laurent polynomial 
Lagrange’s Theorem, 412, 427 
Lakshman, Y., 104, 469 
lattice, 319, 321 

lattice polytope, 291, 294-296, 304, 
318-321, 324 
Laurent 

monomial, 296, 333, 341 
polynomial ($k[x_1^{\pm 1}, \ldots, x_n^{\pm 1}]$), 296,
298, 300, 327, 329-331, 337, 
342, 350, 353, 354, 356, 359, 
371, 377 
see also L{A) 

Lazard, D., 46, 56, 158, 167, 175, 
443, 470 

LCM, 13, 198-200, 205, 211, 212, 435 
leading 

coefficient (LC(f)), 154, 155, 202
monomial (LM(f)), 154, 155, 202
term (LT(f)), 9, 151, 154-156,
159ff, 164, 166, 167, 202ff,
211ff, 246, 441

least common multiple, see LCM 
Leichtweiss, K., 316, 320, 323, 472 
length of a finite free resolution, 239 
level curve, 362 

lex, 8, 10, 15, 16, 25, 27, 38, 46ff, 56,
57, 62, 121, 138, 151, 202-204, 
209, 214, 238, 328, 366-369, 
457 

lexicographic order, see lex 
Li, T., 337, 472 

lift of a polytope $Q$ ($\hat{Q}$), 346, 347
line at infinity, see hyperplane, at
infinity 

linear code, 407, 416-424, 427, 429, 
430, 432, 435, 436, 449 
linear programming, 359, 362, 373 
linear system, 449, 466 
linearly independent elements in a
module, 185, 186 
see also free module 
Little, J., vii-ix, 4, 9, 10, 12-15, 20, 
21, 23, 25, 34, 35, 37, 39, 49, 
52, 72, 73, 81, 86, 145, 149, 
166, 167, 169-171, 173, 174, 
176, 177, 199, 202, 203, 212, 
221, 253, 268, 270, 273, 282, 
284, 285, 289, 308, 328, 340, 
365, 366, 368, 369, 375, 381, 
382, 402, 414, 425, 429, 431, 
448, 453, 455, 456, 458, 467, 
470, 471 

Local Division Algorithm, see 

Division Algorithm, in a local 
ring 

Local Elimination Theorem, 174 
local intersection multiplicity, 139 
local order, 152ff, 168, 169, 173 
for a module over a local ring, 223 
local ring, viii, 130, 132ff, 138ff, 164ff, 
169, 170, 175, 222ff, 249 
localization, 131ff, 139, 146, 156, 175, 
182, 222, 223, 234, 247 
at a point p, 222, 223 
of an exact sequence, 251 
of a homomorphism, 251 
of a module, 222, 223, 250, 251 
with respect to > 

($\mathrm{Loc}_>(k[x_1, \ldots, x_n])$), 154ff,
159, 162, 165, 166
Logar, A., 187, 472 
logarithmic derivative, 410 
Loustaunau, P., vii, 4, 9, 12-15, 21,
25, 37, 174, 201, 202, 359, 468 
lower facet, 346, 347, 350 

μ-basis, 275, 276, 278, 279, 283, 289
μ-constant, 178

see also Milnor number
μ*-constant, 178

see also Teissier μ*-invariant
Macaulay, 39, 206-209, 219, 220, 240,
256, 258, 366, 373, 379, 403, 
432 

commands, 208 
hilb, 403 
mat, 208 
nres, 258 
pres, 240 
putmat, 220 
putstd, 209 
ring, 208 
<ring script, 208 
res, 240, 258 
std, 209 
syz, 219 

Macaulay, F., ix, 101, 102, 472 
MacMahon, P., 384
MacWilliams, F., 415, 427, 472 
magic squares, 313, 375ff 
symmetric, 385 

Manocha, D., 104-106, 114, 115, 119, 
128, 129, 356, 469, 472 
Maple, 10, 14, 15, 18, 21, 25, 28, 36, 
38-40, 44, 54, 56-58, 63, 68, 
73, 108, 114, 120, 144, 167, 
206, 208, 328, 373, 412, 413, 
422, 431, 432, 445 
alias, 413, 422, 431 
array, 422 
charpoly, 68 
collect, 431 
Domains package, 431 
eigenvals, 58 
expand, 30, 45 
Expand, 431 

factor, 26, 27, 108, 144 
finduni, 39, 44, 45, 46 
fsolve, 28, 30, 31, 114, 120 
Gaussjord, 422 
gbasis, 14, 15, 25, 27, 44, 57 
grobner package, 10, 14, 39, 432 
implicitplot3d, 18, 21
minpoly, 54 
mod, 413, 422, 446 
normal, 413 
Normal, 413, 445, 446 
normalf, 10, 36, 45, 68
Rem, 431, 445
resultant, 73 

RootOf, 27, 108, 413, 422, 431
simplify, 40, 46 
solve, 27 

subs, 26, 28, 445, 446 
see also getform, getmatrix, 
kbasis, zdimradical 
Mariner Mars exploration craft, 450 
Martin, B., 158, 471 
Mathematica, 56, 167, 206, 373 
maximal ideal, 5, 131ff, 139, 142, 
225, 226, 228, 231, 407 
maximum distance separable code, 
427 

m-dimensional cyclic code, see cyclic, 
m-dimensional code 
MDS code, see maximum distance 
separable code 
Melancholia, 375 
metric, 417 
Meyer, F., 280, 472 
Milnor, J., 147, 472 
Milnor number (μ), 138, 147, 150,
168, 170, 177, 178 
minimal 

element (with respect to a 
monomial order), 442-445, 
448 

graded resolution, 258ff, 283 
number of generators of a module 
(μ(M)), 224ff

polynomial (of a linear map), 53, 
54, 61 

polynomial (of a primitive 
element), 414, 435, 459, 

465 

presentation, 227-230 
resolution, see minimal, graded 
resolution 

set of generators of a submodule, 
188, 224-230, 234, 258, 259 
set of generators of a submonoid, 
377 

minimum distance (of a code), 

418-420, 422, 424, 427-429, 
434-436, 450, 455, 458, 459, 
461-464 



relative, 464 

Minkowski sum of polytopes (P + Q), 
317ff, 331, 343ff 
coherent, 349, 350, 357 
minor of a matrix, 229, 230, 233, 248, 
249, 266, 279, 287, 288, 314, 
434 

Mishra, B., 63, 472 
mixed cell of a mixed subdivision, 
345, 346, 351, 352, 354 
mixed elimination-weight monomial 
order, 373 

mixed monomial basis, see monomial 
basis, mixed 

mixed order, 154, 155, 172 
mixed sparse resultant, see resultant, 
mixed sparse 

mixed subdivision, 324, 342, 344ff 
coherent, 347-352, 354, 358 
mixed volume, viii, 316, 322-324, 
326, 327, 330-332, 334-337, 
340-343, 345, 346, 351-355, 
357 

computing, 324, 345, 346 
normalized, 323, 332, 334, 341 
Möbius inversion formula, 410
modular curve, 464 
module, viii, ix, 179ff, 234ff, 245ff, 
252, 281, 282, 383, 394, 399, 
400, 439, 440, 445, 448, 453, 
446 

basis of, 186, 188, 395; see also free 
module 

direct sum of, see direct sum of 
modules 

equivalent, see equivalent, modules 
finitely generated, see finitely 
generated module 
free, see free module 
graded, see graded module 
graded free, see graded free module 
Gröbner basis of, see Gröbner
basis, for modules
homomorphism of, see module 
homomorphism 
isomorphism, see module 
isomorphism 
localization of, see localization, of 
a module 

of leading terms ($\langle \mathrm{LT}(M) \rangle$), 204,
210, 404, 440, 441 
over a local ring, 222ff, 241 
presentation of, see presentation of 
a module 

projective, see projective, module 
quotient, see quotient module 
quotient of, see quotient of modules 
rank of, see rank, of a module 
syzygy, see syzygy module 
twisted free, see graded free 
module, twisted structure 
module homomorphism, 183, 184, 

192, 194, 195, 197, 234ff, 251, 
265, 394, 399, 400 
graded, see graded, homomorphism 
localization of, see localization, of 
a homomorphism 
Hom(M, N), 192

module isomorphism, 185, 190, 192, 
195, 217, 218, 229, 235, 262, 
265, 282, 286, 399 
graded, see graded, isomorphism 
molecular structure, 305 
Molien series, 281, 283, 284, 289 
Molien’s Theorem, 284 
Möller, H., 55, 472
monic polynomial, 409-411, 414 
monoid, 377 

monomial, 1, 406, 452, 457 
exponent vector of, 290-292, 350, 
364, 368 

Laurent, see Laurent monomial 
weighted, 289 

relation to integer programming, 
364-366 

monomial in a free module, 198ff 
belongs to $f \in R^m$, 198
contains a basis vector, 198 
quotient of monomials, 198 
monomial basis, 138, 426 
determined by a monomial order, 
see basis monomials 
for generic polynomials, 122, 126 
mixed, 354-356 



monomial ideal, 12, 151, 199 
monomial order, 7, 12, 56, 159, 164, 
365, 367, 368, 370, 372, 382 
adapted, 367, 370, 372, 429, 430, 
440-442, 456 
in $R^m$, 197, 200ff
see also ordering monomials 
monomial submodule, 199, 206, 207 
Mora, T., 46, 56, 156, 160, 169, 173, 
175, 443, 468, 470, 472 
Mora’s algorithm for modules, 224 
Mora Normal Form Algorithm, 159ff, 
165ff 

Moreno, C., 449, 453, 472 
moving 

hyperplane, 280, 287 
line, 274 

line basis, see μ-basis
multidegree (multideg(f)), 154, 155
multiplication map ($m_f$), 51ff, 58ff,
74, 77, 78, 91, 96, 112, 116- 
118, 125, 126, 143, 144, 148, 
149, 356, 429 

multiplicative group, 408, 412, 413 
multiplicity, 42, 65, 66, 112, 115, 119, 
130, 138ff, 168, 170, 177, 249,
329, 330, 340, 353, 354, 463 
multipolynomial resultant, see 

resultant, multipolynomial 
multivariable factorization, 100, 113 
multivariate splines, 385, 389 
Mumford, D., 312, 470 

Nakayama’s Lemma, 225, 226 
National Science Foundation, x 
nearest neighbor decoding function, 
418, 419, 421, 463 
Neumann, W., 158, 471 
Newton polytope of f (NP(f)), 295,
296, 298, 316-318, 320, 321, 
324, 329, 330, 332, 333, 336, 
337, 340, 343, 353 
Newton-Raphson method, 28, 29, 
32-34, 56, 63, 339 
Newton’s method, see 

Newton-Raphson method 
Noetherian ring, 189, 196, 197 
noncollinear points, 271, 272, 278,
279 

non-generic equations, 114, 115, 128, 
129 

non-mixed cell in a mixed 
subdivision, 346 
see also mixed cell 
(non-standard) degree, 369 
nonstandard monomial, 383, 430, 431 
see also standard monomial 
nontrivial solution, 75, 79, 80, 85, 87, 
309 

in sparse context, 311, 312 
nontrivial splines, 391, 397 
normal form, 13 
see also remainder 
normal vector, 362 

see also inward pointing normal 
normalized volume, see volume of a 
polytope, normalized
NP-complete decision problem, 347, 
363 

Nullstellensatz, see Strong 
Nullstellensatz 
number theory, 464 
numerical methods, 28, 32, 56, 60, 
338, 339 

accumulated errors, 57
homotopy continuation, see 
homotopy continuation 
method 

Newton-Raphson, see 

Newton-Raphson method 
root-finding, 327 

Octahedron, 362, 374, 403 
one-point code, 458, 467 

see also geometric Goppa code, 
one-point 

one-to-one, see injective 
one variable splines, see univariate 
splines 

onto, see surjective 
operations research, 360 
optimization, 361 
order 

of a pole, see pole, order of 
of a polynomial g (ord(g)), 163



ordering monomials 
>r, 440-446, 448 
adapted, see monomial order, 
adapted 
alex, see alex 
anti-graded, see degree- 
anticompatible 
order 

arevlex, see arevlex
degree-anticompatible, see 

degree-anticompatible order 
grevlex, see grevlex 
in $R^m$, see monomial order, in $R^m$
lex, see lex 
local, see local order 
mixed, see mixed order 
monomial, see monomial order 
POT, see POT order 
product, see product order 
semigroup, see semigroup, order 
TOP, see TOP order 
weight, see weight order 
ordinary differential equations 
(ODEs), 339 

ordinary double point, see singularity, 
ordinary double point 
oriented edge, see edge, oriented 
orthogonal subspace, 423 
O’Shea, D., vii-ix, 4, 9, 10, 12-15, 
20, 21, 23, 25, 34, 35, 37, 39, 
49, 52, 72, 73, 81, 86, 145, 149, 
166, 167, 169-171, 173, 174, 
176, 177, 199, 202, 203, 212, 
221, 253, 268, 270, 273, 282, 
284, 285, 289, 308, 328, 340, 
365, 366, 368, 369, 375, 381, 
382, 402, 414, 425, 429, 431, 
448, 453, 455, 456, 467, 470 
Ostebee, A., x 
outward normal, 294 

see also inward pointing normal 

#P-complete enumerative problem, 
347 

$\mathbb{P}^n$, see projective n-dimensional
space 

pallet, 359, 360 
parallelogram, 170, 171 
parameters of a code ([n, k, d]), 419,
420, 423, 434, 448, 455, 463, 
467 

parametrization, 76, 80, 81, 87, 132, 
133, 266, 273ff, 285, 286, 299, 
300, 305 

parametrized solution, 335, 338, 339 
parity check 

matrix, 416, 417, 419, 422-424, 434 
position, 419, 430, 431 
Park, H., 187, 472 
partial differential equations, 385 
partial graded resolution, 263 
partial solution, 25 
partition (of an interval), 388, 389 
Pedersen, P., x, 63, 64, 66, 67, 122,
304, 347, 353, 473 
Peitgen, H.-O., 32, 34, 473 
Pellikaan, R., 449, 471 
perfect code, 419, 424 
permutation matrix, 379-381, 384, 
385 

Pfister, A., 158, 169, 471 
Pfister, G., 158, 471 
PID, 5, 39 

piecewise polynomial function, 359, 
385-388, 390, 404 
Pohl, W., 158, 471 
Poisson formula, 92 
pole, 451, 452, 454, 463, 466 
order of, 451, 452, 454, 466 
Poli, A., 414, 473 

polyhedral complex, 389-391, 393, 
398, 402, 403, 405 
cell of, see cell, of a polyhedral 
complex 

dimension of, see dimension, of a 
polyhedral complex 
edge of, see edge of a polyhedral 
complex 

hereditary, see hereditary complex 
pure, see pure complex 
pure, hereditary, see pure, 
hereditary complex 
simplicial, see simplicial complex 
vertex of, see vertex, of a 
polyhedral complex 



polyhedral region, 359, 361, 373, 374, 
376, 389 

unbounded, 361, 376 
polyhedral subdivision, 344 

see also mixed subdivision and 
polyhedral complex 
polyhedron, 291 
polynomial, 2 

relation to polytopes, 290, 294ff 
polynomial ring ($k[x_1, \ldots, x_n]$), 2,
165, 222, 227, 231, 233, 234, 
359, 364 

polynomial splines, see splines 
polytope, viii, ix, 290ff, 316ff, 327, 
336, 340, 345, 346, 389 
convex, see polytope
dimension of, see dimension, of a 
polytope 

edge of, see edge, of a polytope 
face of, see face, of a polytope 
facet of, see facet, of a polytope 
fundamental lattice, see 

fundamental lattice polytope 
lattice, see lattice polytope 
lift of, see lift of a polytope 
Minkowski sum of, see Minkowski
sum of polytopes 
Newton, see Newton polytope 
relation to homogenizing sparse 
equations, 309, 313 
relation to integer programming, 
359, 361 

relation to sparse resultants, 301ff, 
343, 344 

supporting hyperplane of, see 
supporting, hyperplane 
surface area of, see surface area of 
a polytope 

vertex of, see vertex, of a polytope 
volume of, see volume of a polytope
“position-over-term”, see POT order 
POT order, 201ff, 219, 223, 441
power method, see eigenvalues 
presentation matrix, 189ff, 197, 222,
223, 226ff, 237
of Hom(M, N), 192
see also minimal, presentation 
presentation of a module, 234,
236-238, 243, 246 
primary 

decomposition, 144, 149 
ideal, 149 
prime 
field, 407 
ideal, 5, 23, 136 
primitive 

element (of a field extension), 412, 
414, 426, 431, 434, 436, 445, 
450, 459, 460, 465, 
inward pointing normal, see inward 
pointing normal, primitive 
vector in $\mathbb{Z}^n$, 333, 340, 341
principal axis theorem, 65 
principal ideal, 5, 247, 425 
Principal Ideal Domain (PID), see 
PID 
prism, 326 
product 

of ideals, 22 
of rings, 44, 141, 149 
product order, 15, 174, 372 
projection map, 86, 88, 119 
projective 

closure, 451, 454 
curve, 451, 453, 454, 457, 463 
dimension of a module, 249 
embedding, 452 
module, 194, 230-233, 245 
n-dimensional space (P^n), 85, 89, 
108, 119, 252, 266, 270, 307ff, 
402, 466 
Projective 

Extension Theorem, 86, 119 
Projective Images principle, 308, 
312 

Projective Varieties in Affine Space 
principle, 90 

projective variety, 18, 85, 90, 252 
affine part (C^n ∩ V), 90 
part at infinity (P^{n-1} ∩ V), 90 
Puiseux expansion, 331 
punctured codes, 450, 451 
pure, hereditary complex, 391, 392, 
394, 396, 397, 402, 405 
pure complex, 389, 391, 398 



qhull, 348, 350 
Qi, D., 274, 474 
QR algorithm, 56 
quadratic formula, 109 
quadratic splines, 387, 388, 400 
Quillen, D., 187, 194, 231, 473 
Quillen-Suslin Theorem, 187, 194, 
231, 232 

quintic del Pezzo surface, 288 
quotient ideal (I : J), 6, 23, 221 
see also stable quotient of I with 
respect to f 

quotient module (M/N), 183, 190, 
206, 210, 225ff, 235, 244, 440, 
443 

graded, 254 

quotient of modules (M : N), 193 
quotient ring (k[x_1, ..., x_n]/I), 35, 
36, 130, 182, 264, 268, 269, 
272, 273, 276, 278, 279, 285, 
289, 354, 356, 375, 381, 383, 
384, 406-408, 414, 425, 427- 
434, 440, 436, 453, 457, 465, 
467 

R^n, see affine, n-dimensional space 
over R 

radical 

ideal, 4, 5, 22, 23, 39, 43, 44, 58, 
59, 62, 121, 465 
membership test, 171 
of an ideal I (√I), 4, 5, 20, 39, 40, 
171 

Radical Ideal Property, 4 
Raimondo, M., 169, 173, 175, 468 
rank 

of a matrix, 65, 69, 232, 233 
of a module, 232 

rational function, 2, 131, 252, 437, 
451, 452, 454, 458, 463, 466 
field (k(x_1, ..., x_n)), 252, 264, 327, 
328 

field of transcendence degree one, 
449, 452, 454 

field of a curve, see rational 
function, field of 
transcendence degree one 
rational mapping, 288, 466 
rational point over F_q (X(F_q)), 451, 
452, 454, 455, 457, 462-464, 
rational 

normal curve, 266, 287 
normal scroll, 287 
quartic curve, 240, 286 
received word, 436, 438, 445-447 
recurrence relation, 437, 447 
REDUCE, 39, 167, 206, 240 
CALI package, 167, 206, 219, 224, 
240, 241 

reduced Grobner basis, see Grobner 
basis, reduced 
reduced monomial, 101, 102 
in toric context, 315 
reducible variety, 171 
reduction of f by g (Red(f, g)), 156, 
157, 160, 161, 163, 164 
Reed-Muller code, 407, 449ff 
binary, 450 
once-punctured, 458 
punctured, 458, 459, 461 
unpunctured, 461, 462, 464 
Reed-Solomon 

code, 426-428, 431-436, 444, 447, 
449, 450, 452, 460 
decoding, 435ff, 449 
extended, 449 
Rege, A., 122, 470 
regular sequence, 467 
Reif, J., 67, 468 
Reisner, G., 406 

relations among generators, 188, 189, 
193 

see also syzygy 

remainder, 9, 10, 12, 35, 365-367, 
373, 382, 408, 425, 429-432, 
436, 443, 447, 448, 458, 461 
in local case, 159ff, 165, 166, 168, 
172, 173 

in module case, 202-204, 206, 212, 
214 

remainder arithmetic, 35 
Uniqueness of Remainders, 12 
removable singularity, 33 
residue classes, 226 
see also coset 

residue field (k = Q/m), 226 



resolution 

finite free, see finite free resolution 
free, see free resolution 
graded, see graded, resolution 
graded free, see graded, free 
resolution 

homogenization of, see 

homogenize, a resolution 
isomorphism of, see isomorphic 
resolutions 

length of, see length of a finite free 
resolution 

minimal, see minimal, resolution 
minimal graded, see minimal, 
graded resolution 
partial graded, see partial graded 
resolution 

trivial, see trivial resolution 
resultant, vii-ix, 71, 290 

A-resultant, see resultant, sparse 
and multiplicities, 144, 145 
computing, 96ff, 342ff 
dense, 302 

geometric meaning, 85ff 
mixed, see resultant, mixed sparse 
mixed sparse, 342, 343, 347, 348, 
351, 353, 354, 357 
multipolynomial, viii, 71, 78ff, 
298-302, 304, 307, 308, 330, 
343, 347, 348, 353 
properties of, 80, 89ff 
of two polynomials, 71ff 
solving equations via, see solving 
polynomial equations 
sparse, viii, 82, 106, 129, 298ff, 
306ff, 330, 342ff 

Sylvester determinant for, 71, 278 
u-resultant, see u-resultant 
(unmixed) sparse resultant, 342 
revenue function, 360, 362, 367, 368 
Richter, P., 32, 34, 473 
Riemann-Roch Theorem, 449, 458, 
463, 466 

Riemann Hypothesis, 452 
ring 

Cohen-Macaulay, see 

Cohen-Macaulay ring 
commutative, see commutative 
ring 

convergent power series, see 

convergent power series ring 
formal power series, see formal 
power series ring 
homomorphism of, 44, 52, 117, 
120, 141, 149, 275, 281, 364, 
367, 371, 377, 378, 406, 414 
integral domain, see integral 
domain 

isomorphism of, 44, 137, 149, 150, 
406, 430, 433 

localization of, see localization 
Noetherian, see Noetherian ring 
of invariants (S^G), 281, 283, 284 
polynomial, see polynomial ring 
product of, see product of rings 
quotient, see quotient ring 
valuation, see valuation ring 
R^m, see free module 
R-module, see module 
Robbiano, L., 153, 473 
Rojas, J.M., x, 308, 336, 342, 337, 
356, 473 

Rose, L., 385, 386, 395, 398-401, 469, 
473 

row operation, 195, 196, 230 

integer, see integer row operation 
row reduction, 422 
Roy, M.-F., 63, 64, 66, 67, 473 
Ruffini, P., 27 

Saints, K., 436, 449, 458, 471 
Saito, T., 274, 474 
Sakata, S., 436, 473 
Salmon, G., 83, 103, 473 
Schenk, H., 386, 401, 473 
Schmale, W., x 

Schreyer, F.-O., 211, 212, 273, 284, 
473, 474 

Schreyer’s Theorem, 212, 224, 238, 
240, 245ff, 257, 395 
Schrijver, A., 359, 474 
Schonemann, H., 158, 471 
Second Fundamental Theorem of 
Invariant Theory, 93 
Second Isomorphism Theorem, 193 



second syzygies, 234, 238ff, 284 
see also syzygy module 
Sederberg, T., 274, 278-280, 470, 474 
Segre map, 309 
semigroup 

of pole orders, 451, 452, 454 
order, 152ff, 164-166, 173 
Serre, J.-P., 187, 474 
Serre’s conjecture, see Serre problem 
Serre problem, 187, 194, 231 
see also Quillen-Suslin Theorem 
Shafarevich, I., 86, 88, 90, 91, 94, 474 
Shannon’s Theorem, 420 
Shape Lemma, 62 

shifted module, see twist of a graded 
module M 

shipping problem, 359, 360, 363, 365, 
367, 368, 370 
Shurman, J., x 
Siebert, T., 158, 471 
signature of a symmetric bilinear 
form over R, 65, 66, 69 
simple root, 33 

simplex, 292, 294, 324, 330, 347 
simplicial complex, 401, 404 
Singer, M., x 

Singleton bound, 420, 427, 462 
Singular, 39, 158, 167, 168, 172, 178, 
206, 209, 210, 219, 224, 240, 
241, 249, 366, 370, 373, 432 
homepage, 167 

ideal, 167, 168, 172, 242, 250, 370 

milnor package, 178 

module, 210 

poly, 172, 370 

reduce, 172, 370 

res, 242 

ring, 167, 172, 209, 219, 242, 250, 
370 

sres, 249, 250 
std, 370 
syz, 219 
vdim, 168 
vector, 209, 210 
singular point, see singularity 
singular curve, 454 
singularity, 130, 147, 148, 451 
isolated, 147, 148, 177, 178 

ordinary double point, 150 
skew-symmetric matrix, 288 
slack variable, 363-367 
Sloane, N., 415, 427, 472 
smooth curve, 451, 453, 454, 455, 457 
solutions at ∞, 108, 112, 115, 144, 
329, 330 

see also projective variety, part at 
infinity 

solving polynomial equations, viii-ix, 
46, 60, 327 

determinant formulas, 71, 83, 103, 
119 

generic number of solutions, 
real solutions, 63ff, 67 
via eigenvalues, viii, 51, 54ff, 122ff, 
354 

via eigenvectors, 59ff, 127, 354 
via elimination, 24ff, 56 
via homotopy continuation, 

see homotopy continuation 
method 

via resultants, viii, ix, 71, 108ff, 
114ff, 305, 329, 338, 342, 353ff 
via toric varieties, 313 
see also numerical methods 
sparse Bezout’s Theorem, see 
Bernstein’s Theorem 
sparse polynomial, 298, 329, 330 
see also L(A) 

sparse resultant, see resultant, sparse 
sparse system of equations, 339, 340 
see also sparse polynomial 
special position, 451, 452, 454, 467 
spectral theorem, 65 
Speder, J.-P., 178 
Spence, L., 149, 471 
splines, viii, ix, 385ff 

bivariate, see bivariate splines 
see splines 
cubic, see cubic, spline 
multivariate, see multivariate 
splines 

nontrivial, see nontrivial splines 
one variable, see univariate splines 
quadratic, see quadratic splines 
trivial, see trivial splines 



univariate, see univariate splines 
see also piecewise polynomial 
functions 

split exact sequence, see exact 
sequence, split 

S-polynomial, 13, 14, 165, 166, 207, 
211, 366 

square-free part of p (p_red), 39, 40, 
149 

stable quotient of I with respect to f, 
175 

standard basis 

of an ideal, viii, 164ff, 175-177 
of a module over a local ring, 223, 
224, 441 

of R^m, 186, 198, 199, 207-209, 219, 
220, 236, 242, 246, 250, 255, 
258, 259, 261, 294 
standard form 

of an integer programming 
problem, 364 

of a polynomial, 382, 425; see also 
remainder 

standard monomial, 36, 382-384, 
430, 431 

basis, see basis monomials 
see also basis monomials 
standard representative, see standard 
form, of a polynomial 
Stanfield, M., x 

Stanley, R., 283, 284, 375, 384, 405, 
406, 474 

start system, 338, 339 
dense, 339, 340 
statistics, 375, 384 
Stetter, H., 55, 129, 468, 474 
Stichtenoth, H., 449, 452, 453, 458, 
463, 464, 471, 474 
Stillman, M., 386, 401, 473 
Stirling’s formula, 464 
Strang, G., 401, 404 
Strong Nullstellensatz, 21, 41, 43, 54, 
62, 83, 99, 141, 148, 149, 171, 
465 

Sturmfels, B., ix, x, 82, 92, 93, 119, 
122, 187, 284, 304, 306-308, 
324, 336, 337, 340, 343, 346, 
347, 353, 354, 356, 359, 375, 







377, 378, 383, 384, 386, 401, 
469-474 

Sturm sequence, 70 
Sturm’s Theorem, 69, 70 
S-vector (S(f, g)), 204-206, 212-214 
over a local ring, 223 
subdivision 

algebraic (non-polyhedral) , 404, 
405 

coherent, see mixed subdivision, 
coherent 

mixed, see mixed subdivision 
polyhedral, see polyhedral complex 
and polyhedral subdivision 
subfield, 408, 409, 413 
submatrix, 426 

submodule, 180ff, 197ff, 217, 224, 
228, 239, 244-246, 250, 254, 
262, 399, 440-442 
generated by elements of 182, 
217 

graded, see graded, submodule 
Submodule Membership problem, 
197 

submonoid, 377 
subpolytope, 326 
subring 

generated by f_1, ..., f_n, 
365 

membership test, 365, 371, 372 
sum 

of ideals (I + J), 6, 22 
of submodules (N + M), 191, 242, 
243 

supporting 

hyperplane of a polytope, 293, 294, 
309, 319, 320, 325 
line, 362 

surface area of a polytope, 326 
surjective, 235, 245, 265, 281, 312, 
313, 365, 382, 414 
Suslin, A., 187, 194, 231, 474 
Sweedler, M., x 
symbolic 

computation, 56, 104, 113, 352 
parameter, 327 

Sylvester determinant, see resultant, 
Sylvester determinant for 



symmetric 

bilinear form, 64, 65 
magic square, see magic square, 
symmetric 

syndrome, 421, 424, 436 
decoding, 421, 435 
polynomial, 437, 438, 440, 445, 446 
systematic 

encoder, 419, 430, 431, 466 
generator matrix, 419, 422, 423 
syzygy, 175, 189, 199, 200, 210ff, 234, 
236ff, 271, 272, 277, 280, 282, 
283, 395, 398, 399 
homogeneous, see homogeneous 
syzygy 

over a local ring, 223, 224, 228, 230 
syzygy module (Syz(f_1, ..., f_t)), 
189, 190, 194, 197, 199, 208, 
211ff, 231-233, 234, 236ff, 
245ff, 257, 263, 264, 271, 
275-277, 279, 280, 286, 395, 
397, 398, 403 
Syzygy Problem, 197, 210 
Syzygy Theorem, see Hilbert Syzygy 
Theorem 

Szpirglas, A., 63, 64, 66, 67, 473 

Tangent cone, 170, 177, 178 
Taylor series, 138 
term, 2, 74 

term orders (in local case), 151ff 
see also ordering monomials 
“term-over-position”, see TOP order 
ternary quadric, 82, 83 
Teissier μ*-invariant, 178 
tetrahedron, 403 

Third Isomorphism Theorem, 193 
third syzygy, 234, 238-240 
see also syzygy module 
Thomas, R., 359, 474 
Tjurina number, 138, 148, 168, 170, 
177 

“top down” ordering, 201, 209 
TOP order, 201ff, 223, 441 
toric 

GCP, see generalized characteristic 
polynomial, toric 
ideal, 383 

variety, viii, 305, 306, 307ff, 331, 
336, 359, 375, 383, 384 
total degree of a polynomial f 
(deg(f)), 2, 74, 163, 455, 456, 
460 

total ordering, 7, 152, 153, 201, 369 
trace of a matrix M (Tr(M)), 64, 66, 
translate of a subset in R^n, 297, 298, 
320, 321, 326 

transpose of a matrix M (M^T), 125, 
126, 182, 356, 423 

Traverso, C., 158, 169, 359, 363, 372, 
373, 470, 472 
tree, 400 

trivial resolution, 263 
trivial splines, 395, 397, 398, 400 
trucking firm, 359, 360 
Tsfasman, M., 448, 463, 464, 474 
twist of a graded module M (M(d)), 
255, 267 
twisted 

Chow form, see Chow form, 
twisted 

cubic, 247, 269, 289 

free module, see graded free 
module 

Unbounded polyhedral region, see 
polyhedral region, unbounded 
unimodular row, 187, 194, 233 
union of varieties (U ∪ V), 19, 22 
Uniqueness of Monic Grobner Bases, 
15 

Uniqueness of Remainders, see 
remainder 

unit cube in R^n, 292 
unit of a ring, 131, 132, 137, 159, 
162, 163, 225 

univariate splines, 385, 386, 388, 390 
universal categorical quotient, 312 
“universal” polynomials, 85, 95, 101, 
102 

University of Kaiserslautern, 167 
University of Minnesota, 348 
unpunctured code, see Reed-Muller 
code, unpunctured 
unshortenable, 224, 225 



u-resultant, 110ff, 118, 122, 128, 144, 
353, 354, 356 

Valuation ring, 135, 136 
van der Geer, G., 449, 475 
van der Waerden, B., 73, 80, 110, 474 
Vandermonde determinant, 426, 434 
van Lint, J., 415, 420, 427, 449, 475 
Varchenko, V., 178, 468 
variety, 18, 130, 177, 274, 451, 462 
affine, see affine, variety 
degree, see degree, of a variety 
dimension of, see dimension, of a 
variety 

ideal of, see ideal, of a variety 
irreducible, see irreducible, variety 
irreducible component of, see 
irreducible, component of a 
variety 

Jacobian, see Jacobian, variety 
of an ideal (V(I)), 20 
projective, see projective variety 
reducible, see reducible variety 
toric, see toric, variety 
vector f (in a free module), 180ff, 236 
Verlinden, P., 338, 340, 475 
Veronese 

map, 308, 313 
surface, 287 

Verschelde, J., 338, 340, 471, 475 
vision, 305 

Vladut, S., 448, 463, 464, 474 
vertex 

of a graph, 398 

of a polyhedral complex, 389, 390, 
403-405 

of a polytope, 292-294, 311, 326, 
349, 352, 362, 374 
vertex monomial, 311, 312, 316 
volume of a polytope Q (Vol_n(Q)), 
292, 319-324, 326, 327 
normalized, 319-322, 326, 352 
relation to sparse resultants, 
302-304 

relation to mixed volume, 322-324, 
326, 327, 345, 346 

Wang, X., 337, 342, 472, 473 
Warren, J., 102, 468 
web site, x, 167 
Weierstrass 

Gap Theorem, 452, 454 
point, 454 
weight, 454 

weight (of a variable), 100, 107, 108, 
282 

weight order, 15, 368, 372, 440 
weighted degree of a monomial, 289 
weighted homogeneous polynomial, 
178, 282, 289 

Weispfenning, V., vii, 4, 9, 12-15, 25, 
37, 51, 63, 69, 174, 468 
well-ordering, 7, 152, 155, 201, 369 
Weyman, J., 343, 475 
White, J., x 

Wilkinson, J., 30, 31, 475 
Woodburn, C., x, 187, 472 



words, see codewords 
Yoke, 189 

Zalgaller, V., 316, 320, 323, 469 
Zariski closure, 23, 301, 302, 308, 
313, 342, 383 

zdimradical (Maple procedure), 45 
Zelevinski, A., 76, 80, 87, 89, 93, 94, 
103, 119, 301, 302, 304, 307, 
308, 316, 336, 342, 343, 353, 
471, 474 

zero-dimensional ideal, 37ff, 41, 

44-46, 48, 51, 53, 55, 59, 61, 
62, 64, 65, 130, 138ff, 149-151, 
176 

Zero Function, 17 
Ziegler, G., 291, 475 
Zink, T., 448, 463, 464, 474 